Description



The genus Staphylococcus includes at least 20 distinct species [...] pharmaceutical development, among others.
Human Health and S. aureus

[...] contact with viable bacteria results in mixed colonization [...] general physiological dysfunction. [...]
debris on the burn surface ("eschar"), [...] restricted to the [...]. [...] invade viable tissue below the eschar,
and it may reach below the skin, enter the lymphatic and [...] develop into septicaemia. S. aureus is
among the most important pathogens found in burn wound infections. It can destroy granulation tissue and
produce severe septicaemia.

Cellulitis 

[...] In fact, cellulitis can be one aspect of [...]. Cellulitis can lead to systemic infection.

[...] This condition [...] is caused by a mixture of [...] and microaerophilic [...]. [...] causes necrosis and
treatment is limited to excision of the necrotic tissue. The condition often is fatal.

Eyelid infections 

Food poisoning 

[...] ingestion of the [...] in sufficient quantities, typically results in severe [...].
Although the toxins are known, their [...]

Joint infections 



S. aureus infects bone joints, causing diseases such as osteomyelitis.

Osteomyelitis

S. aureus is the most common causative agent of haematogenous osteomyelitis. The disease tends to occur in
children and adolescents more than adults and it is associated with non-penetrating injuries to bones. Infection typically
occurs in the long end of growing bone, hence its occurrence in physically immature populations. Most often, infection
is localized in the vicinity of sprouting capillary loops adjacent to epiphysial growth plates in the end of long, growing
bones.

Skin infections 

S. aureus is the most common pathogen of such minor skin infections as abscesses and boils. Such infections 
often are resolved by normal host response mechanisms, but they also can develop into severe internal infections. 
Recurrent infections of the nasal passages plague nasal carriers of S. aureus.

Surgical Wound Infections 

Surgical wounds often penetrate far into the body. Infection of such wounds thus poses a grave risk to the patient.
S. aureus is the most important causative agent of infections in surgical wounds. S. aureus is unusually adept at
invading surgical wounds; sutured wounds can be infected by far fewer S. aureus cells than are necessary to cause
infection in normal skin. Invasion of surgical wounds can lead to severe S. aureus septicaemia. Invasion of the blood
stream by S. aureus can lead to seeding and infection of internal organs, particularly heart valves and bone, causing 
systemic diseases, such as endocarditis and osteomyelitis. 

Scalded Skin Syndrome 

S. aureus is responsible for "scalded skin syndrome" (also called toxic epidermal necrosis, Ritter's disease and
Lyell's disease). This disease occurs in older children, typically in outbreaks caused by flowering of S. aureus strains
that produce exfoliatin (also called scalded skin syndrome toxin). Although the bacteria initially may infect only a minor
lesion, the toxin destroys intercellular connections, spreads epidermal layers and allows the infection to penetrate the
outer layer of the skin, producing the desquamation that typifies the disease. Shedding of the outer layer of skin
generally reveals normal skin below, but fluid lost in the process can produce severe injury in young children if it is not
treated properly.


Toxic Shock Syndrome 

Toxic shock syndrome is caused by strains of S. aureus that produce the so-called toxic shock syndrome toxin. 
The disease can be caused by S. aureus infection at any site, but it is too often erroneously viewed as a
disease solely of women who use tampons. The disease involves toxaemia and septicaemia, and can be fatal.

Nosocomial Infections

In the 1984 National Nosocomial Infection Surveillance Study ("NNIS"), S. aureus was the most prevalent agent
of surgical wound infections in many hospital services, including medicine, surgery, obstetrics, pediatrics and newborns.

Resistance to drugs of S. aureus strains

Prior to the introduction of penicillin the prognosis for patients seriously infected with S. aureus was unfavorable.
Following the introduction of penicillin in the early 1940s even the worst S. aureus infections generally could be treated
successfully. The emergence of penicillin-resistant strains of S. aureus did not take long, however. Most strains of S.
aureus encountered in hospital infections today do not respond to penicillin; although, fortunately, this is not the case
for S. aureus encountered in community infections.

It is well known now that penicillin-resistant strains of S. aureus produce a lactamase which converts penicillin to
penicilloic acid, and thereby destroys antibiotic activity. Furthermore, the lactamase gene often is propagated
episomally, typically on a plasmid, and often is only one of several genes on an episomal element that, together, confer
[...] with resistance to many other antibiotics effective against this organism, including aminoglycosides, tetracycline,
chloramphenicol, macrolides and lincosamides. In fact, methicillin-resistant strains of S. aureus generally are multiply
drug resistant.



EP0 786 519 A2 



The molecular genetics of most types of drug resistance in S. aureus has been elucidated (See Lyon et al.,
Microbiology Reviews 51: 88-134 (1987)). Generally, resistance is mediated by plasmids, as noted above regarding penicillin
resistance; however, several stable forms of drug resistance have been observed that apparently involve integration 
of a resistance element into the S. aureus genome itself. 

Thus far, each new antibiotic gives rise to resistant strains, strains emerge that are resistant to multiple drugs,
and increasingly persistent forms of resistance begin to emerge. Drug resistance of S. aureus infections already poses
significant treatment difficulties, which are likely to get much worse unless new therapeutic agents are developed. 

Molecular Genetics of Staphylococcus aureus

Despite its importance in, among other things, human disease, relatively little is known about the genome of this 
organism. 

Most genetic studies of S. aureus have been carried out using the strain NCTC 8325, which contains prophages
psi11, psi12 and psi13, and the UV-cured derivative of this strain, 8325-4 (also referred to as RN450), which is free of
the prophages. 

These studies revealed that the S. aureus genome, like that of other staphylococci, consists of one circular, cov- 
alently closed, double-stranded DNA and a collection of so-called variable accessory genetic elements, such as 
prophages, plasmids, transposons and the like. 

Physical characterization of the genome has not been carried out in any detail. Pattee et al. published a low-resolution
and incomplete genetic and physical map of the chromosome of S. aureus strain NCTC 8325 (Pattee et al.,
Genetic and Physical Mapping of Chromosome of Staphylococcus aureus NCTC 8325, Chapter 11, pgs. 163-169 in
MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI, R.P. Novick, Ed., VCH Publishers, New York (1990)). The genetic
map largely was produced by mapping insertions of Tn551 and Tn4001, which, respectively, confer erythromycin and
gentamicin resistance, and by analysis of SmaI-digested DNA by Pulsed Field Gel Electrophoresis ("PFGE").

The map was of low resolution; even estimating the physical size of the genome was difficult, according to the 
investigators. The size of the largest SmaI chromosome fragment, for instance, was too large for accurate sizing by
PFGE. To estimate its size, additional restriction sites had to be introduced into the chromosome using a transposon
containing a SmaI recognition sequence.

In sum, most physical characteristics and almost all of the genes of Staphylococcus aureus are unknown. Among 
the few genes that have been identified, most have not been physically mapped or characterized in detail. Only a very 
few genes of this organism have been sequenced. (See, for instance, Thornsberry, J. Antimicrobial Chemotherapy 21
Suppl C: 9-16 (1988), current versions of GENBANK and other nucleic acid databases, and references that relate to
the genome of S. aureus such as those set out elsewhere herein.)

It is clear that the etiology of diseases mediated or exacerbated by S. aureus infection involves the programmed 
expression of S. aureus genes, and that characterizing the genes and their patterns of expression would add dramat- 
ically to our understanding of the organism and its host interactions. Knowledge of S. aureus genes and genomic 
organization would dramatically improve understanding of disease etiology and lead to improved and new ways of 
preventing, ameliorating, arresting and reversing diseases. Moreover, characterized genes and genomic fragments of 
S. aureus would provide reagents for, among other things, detecting, characterizing and controlling S. aureus infections.
There is a need therefore to characterize the genome of S. aureus and for polynucleotides and sequences of this 
organism. 

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome. The primary 
nucleotide sequences which were generated are provided in SEQ ID NOS: 1-5,191. 

The present invention provides the nucleotide sequence of several thousand contigs of the Staphylococcus aureus 
genome, which are listed in tables below and set out in the Sequence Listing submitted herewith, and representative 
fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan. In one embod- 
iment, the present invention is provided as contiguous strings of primary sequence information corresponding to the 
nucleotide sequences depicted in SEQ ID NOS:1-5,191.

The present invention further provides nucleotide sequences which are at least 95%, preferably 99% and most 
preferably 99.9%, identical to the nucleotide sequences of SEQ ID NOS:1-5,191.

The nucleotide sequence of SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence
which is at least 95%, preferably 99% and most preferably 99.9%, identical to the nucleotide sequence of SEQ ID
NOS:1-5,191 may be provided in a variety of mediums to facilitate its use. In one application of this embodiment, the
sequences of the present invention are recorded on computer readable media. Such media include, but are not limited
to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media
such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/ 
optical storage media. 

The present invention further provides systems, particularly computer-based systems, which contain the sequence
information herein described stored in a data storage means. Such systems are designed to identify commercially
important fragments of the Staphylococcus aureus genome. 

Another embodiment of the present invention is directed to fragments, preferably isolated fragments, of the
Staphylococcus aureus genome having particular structural or functional attributes. Such fragments of the Staphylococcus
aureus genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter
referred to as open reading frames or "ORFs," fragments which modulate the expression of an operably linked ORF,
hereinafter referred to as expression modulating fragments or "EMFs," and fragments which can be used to diagnose
the presence of Staphylococcus aureus in a sample, hereinafter referred to as diagnostic fragments or "DFs."

Each of the ORFs in fragments of the Staphylococcus aureus genome disclosed in Tables 1-3, and the EMFs 
found 5' to the ORFs, can be used in numerous ways as polynucleotide reagents. For instance, the sequences can be
used as diagnostic probes or amplification primers for detecting or determining the presence of a specific microbe in
a sample, to selectively control gene expression in a host and in the production of polypeptides, such as polypeptides
encoded by ORFs of the present invention, particularly those polypeptides that have a pharmacological activity.

The present invention further includes recombinant constructs comprising one or more fragments of the
Staphylococcus aureus genome of the present invention. The recombinant constructs of the present invention comprise
vectors, such as a plasmid or viral vector, into which a fragment of the Staphylococcus aureus genome has been inserted.

The present invention further provides host cells containing any of the isolated fragments of the Staphylococcus 
aureus genome of the present invention. The host cells can be a higher eukaryotic host cell, such as a mammalian 
cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell such as a bacterial cell. 

The present invention is further directed to polypeptides and proteins, preferably isolated polypeptides and
proteins, encoded by ORFs of the present invention. A variety of methods, well known to those of skill in the art, routinely
may be utilized to obtain any of the polypeptides and proteins of the present invention. For instance, polypeptides and
proteins of the present invention having relatively short, simple amino acid sequences readily can be synthesized using
commercially available automated peptide synthesizers. Polypeptides and proteins of the present invention also may
be purified from bacterial cells which naturally produce the protein. Yet another alternative is to purify polypeptides and
proteins of the present invention from cells which have been altered to express them.

The invention further provides polypeptides, preferably isolated polypeptides, comprising Staphylococcus aureus 
epitopes and vaccine compositions comprising such polypeptides. Also provided are methods for vaccinating an
individual against Staphylococcus aureus infection.

The invention further provides methods of obtaining homologs of the fragments of the Staphylococcus aureus
genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention.
Specifically, by using the nucleotide and amino acid sequences disclosed herein as a probe or as primers, and techniques
such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.

The invention further provides antibodies which selectively bind polypeptides and proteins of the present invention. 
Such antibodies include both monoclonal and polyclonal antibodies.

The invention further provides hybridomas which produce the above-described antibodies. A hybridoma is an 
immortalized cell line which is capable of secreting a specific monoclonal antibody.

The present invention further provides methods of identifying test samples derived from cells which express one 
of the ORFs of the present invention, or a homolog thereof. Such methods comprise incubating a test sample with one 
or more of the antibodies of the present invention, or one or more of the DFs or antigens of the present invention, under
conditions which allow a skilled artisan to determine if the sample contains the ORF or product produced therefrom.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry
out the above-described assays.

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers 
which comprises: (a) a first container comprising one of the antibodies, antigens, or one of the DFs of the present
invention and (b) one or more other containers comprising one or more of the following: wash reagents, reagents
capable of detecting presence of bound antibodies, antigens or hybridized DFs.

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining 
and identifying agents capable of binding to a polypeptide or protein encoded by one of the ORFs of the present 
invention. Specifically, such agents include, as further described below, antibodies, peptides, carbohydrates,
pharmaceutical agents and the like. Such methods comprise steps of: (a) contacting an agent with an isolated protein encoded
by one of the ORFs of the present invention; and (b) determining whether the agent binds to said protein.


The methodology and technology for elucidating extensive genomic sequences of bacterial and other genomes
has enhanced, and will greatly enhance, the ability to analyze and understand chromosomal organization. In particular, sequenced
contigs and genomes will provide the models for developing tools for the analysis of chromosome structure and function,
including the ability to identify genes within large segments of genomic DNA, the structure, position, and spacing of
regulatory elements, the identification of genes with potential industrial applications, and the ability to do comparative
genomics and molecular phylogeny.

FIGURE 1 is a block diagram of a computer system (102) that can be used to implement computer-based systems 
of the present invention.

FIGURE 2 is a schematic diagram depicting the data flow and computer programs used to collect, assemble, edit 
and annotate the contigs of the Staphylococcus aureus genome of the present invention. Both Macintosh and Unix 
platforms are used to handle the AB 373 and 377 sequence data files, largely as described in Kerlavage et al., Pro- 
ceedings of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, 585, IEEE Computer So- 
ciety Press, Washington D.C. (1993). Factura (AB) is a Macintosh program designed for automatic vector sequence 
removal and end-trimming of sequence files. The program Loadis runs on a Macintosh platform and parses the feature 
data extracted from the sequence files by Factura to the Unix based Staphylococcus aureus relational database. As- 
sembly of contigs (and whole genome sequences) is accomplished by retrieving a specific set of sequence files and 
their associated features using extrseq, a Unix utility for retrieving sequences from an SQL database. The resulting 
sequence file is processed by seq_filter to trim portions of the sequences with more than 2% ambiguous nucleotides.
The sequence files were assembled using TIGR Assembler, an assembly engine designed at The Institute for Genomic 
Research ("TIGR") for rapid and accurate assembly of thousands of sequence fragments. The collection of contigs
generated by the assembly step is loaded into the database with the lassie program. Identification of open reading 
frames (ORFs) is accomplished by processing contigs with zorf. The ORFs are searched against S. aureus sequences 
from Genbank and against all protein sequences using the BLASTN and BLASTP programs, described in Altschul et 
al., J. Mol. Biol. 215: 403-410 (1990). Results of the ORF determination and similarity searching steps were loaded
into the database. As described below, some results of the determination and the searches are set out in Tables 1-3.
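
The ORF identification step can be illustrated with a minimal sketch. The internals of zorf are not described here, so the following Python is only an assumed stand-in and not the program actually used: it scans the six reading frames of a contig for stretches that begin with ATG and end at a stop codon. The function names and the `min_codons` threshold are hypothetical and chosen for illustration only.

```python
# Illustrative six-frame ORF scan; an assumed stand-in, not the zorf program.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(contig, min_codons=3):
    """Yield (strand, frame, start, end) for ATG..stop ORFs.

    start and end are 0-based, end-exclusive offsets on the scanned strand,
    and the stop codon is included in the reported span. min_codons counts
    the codons before the stop codon (hypothetical default).
    """
    contig = contig.upper()
    for strand, seq in (("+", contig), ("-", reverse_complement(contig))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(seq) - 2, 3):
                codon = seq[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    if (pos - start) // 3 >= min_codons:
                        yield strand, frame, start, pos + 3
                    start = None  # resume looking for the next start codon
```

A production ORF caller would also consider alternative start codons and ambiguous bases; this sketch deliberately omits both.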

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome and analysis 
of the sequences. The primary nucleotide sequences generated by sequencing the fragments are provided in SEQ ID 
NOS:1-5,191. (As used herein, the "primary sequence" refers to the nucleotide sequence represented by the IUPAC 
nomenclature system.) 

In addition to the aforementioned Staphylococcus aureus polynucleotide and polynucleotide sequences, the 
present invention provides the nucleotide sequences of SEQ ID NOS:1-5,191, or representative fragments thereof, in 
a form which can be readily used, analyzed, and interpreted by a skilled artisan. 

As used herein, a "representative fragment of the nucleotide sequence depicted in SEQ ID NOS:1-5,191" refers 
to any portion of the SEQ ID NOS:1-5,191 which is not presently represented within a publicly available database. 
Preferred representative fragments of the present invention are Staphylococcus aureus open reading frames ("ORFs"),
expression modulating fragments ("EMFs") and fragments which can be used to diagnose the presence of Staphylococcus
aureus in a sample ("DFs"). A non-limiting identification of preferred representative fragments is provided in Tables
1-3. 

As discussed in detail below, the information provided in SEQ ID NOS:1-5,191 and in Tables 1-3 together with
routine cloning, synthesis, sequencing and assay methods will enable those skilled in the art to clone and sequence 
all "representative fragments" of interest, including open reading frames encoding a large variety of Staphylococcus 
aureus proteins. 

While the presently disclosed sequences of SEQ ID NOS:1-5,191 are highly accurate, sequencing techniques are 
not perfect and, in relatively rare instances, further investigation of a fragment or sequence of the invention may reveal 
a nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID NOS:1-5,191. However, once the 
present invention is made available (i.e., once the information in SEQ ID NOS:1-5,191 and Tables 1-3 has been made 
available), resolving a rare sequencing error in SEQ ID NOS:1-5,191 will be well within the skill of the art. The present
disclosure makes available sufficient sequence information to allow any of the described contigs or portions thereof to
be obtained readily by straightforward application of routine techniques. Further sequencing of such polynucleotides
may proceed in like manner using manual and automated sequencing methods which are employed ubiquitously in the
art. Nucleotide sequence editing software is publicly available. For example, Applied Biosystems' (AB) AutoAssembler
can be used as an aid during visual inspection of nucleotide sequences. By employing such routine techniques potential 
errors readily may be identified and the correct sequence then may be ascertained by targeting further sequencing 
effort, also of a routine nature, to the region containing the potential error. 

Even if all of the very rare sequencing errors in SEQ ID NOS:1-5,191 were corrected, the resulting nucleotide 
sequences would still be at least 95% identical, nearly all would be at least 99% identical, and the great majority would 
be at least 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-5,191. 

As discussed elsewhere herein, polynucleotides of the present invention readily may be obtained by routine
application of well known and standard procedures for cloning and sequencing DNA. Detailed methods for obtaining
libraries and for sequencing are provided below, for instance. A wide variety of Staphylococcus aureus strains that can
be used to prepare S. aureus genomic DNA for cloning and for obtaining polynucleotides of the present invention are
available to the public from recognized depository institutions, such as the American Type Culture Collection ("ATCC").

The nucleotide sequences of the genomes from different strains of Staphylococcus aureus differ somewhat.
However, the nucleotide sequences of the genomes of all Staphylococcus aureus strains will be at least 95% identical, in
corresponding part, to the nucleotide sequences provided in SEQ ID NOS:1-5,191. Nearly all will be at least 99%
identical and the great majority will be 99.9% identical.

Thus, the present invention further provides nucleotide sequences which are at least 95%, preferably 99% and
most preferably 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-5,191, in a form which can be readily
used, analyzed and interpreted by the skilled artisan.

Methods for determining whether a nucleotide sequence is at least 95%, at least 99% or at least 99.9% identical
to the nucleotide sequences of SEQ ID NOS:1-5,191 are routine and readily available to the skilled artisan. For example,
the well known FASTA algorithm described in Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85: 2444 (1988) can be
used to generate the percent identity of nucleotide sequences. The BLASTN program also can be used to generate
an identity score of polynucleotides compared to one another.
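
As a rough illustration of what a percent identity figure expresses, the following Python sketch computes it for two sequences that have already been aligned. It does not reproduce FASTA or BLASTN, which perform the alignment themselves and may define identity differently; the convention used here (matching columns divided by all aligned columns, with '-' marking a gap) and both function names are assumptions made for the example.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding the same nucleotide.

    Both inputs must be pre-aligned strings of equal length; '-' marks a gap.
    Columns in which both strings carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pct):
    """Return the highest of the 95% / 99% / 99.9% tiers that pct reaches."""
    for threshold in (99.9, 99.0, 95.0):
        if pct >= threshold:
            return threshold
    return None  # below every tier
```

For instance, two aligned 100-nucleotide sequences differing at a single position would fall in the 99% tier but not the 99.9% tier under this convention.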

COMPUTER RELATED EMBODIMENTS 

The nucleotide sequences provided in SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to a polynucleotide
sequence of SEQ ID NOS:1-5,191 may be "provided" in a variety of mediums to facilitate use thereof. As used herein,
"provided" refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence
of the present invention; i.e., a nucleotide sequence provided in SEQ ID NOS:1-5,191, a representative fragment
thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical
to a polynucleotide of SEQ ID NOS:1-5,191. Such a manufacture provides a large portion of the Staphylococcus aureus
genome and parts thereof (e.g., a Staphylococcus aureus open reading frame (ORF)) in a form which allows a skilled
artisan to examine the manufacture using means not directly applicable to examining the Staphylococcus aureus
genome or a subset thereof as it exists in nature or in purified form.

In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer
readable media. As used herein, "computer readable media" refers to any medium which can be read and accessed
directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard
disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as
RAM and ROM; and hybrids of these categories, such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable mediums can be used to create a manufacture
comprising computer readable medium having recorded thereon a nucleotide sequence of the present invention. Likewise,
it will be clear to those of skill how additional computer readable media that may be developed also can be used to
create analogous manufactures having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled 
artisan can readily adopt any of the presently known methods for recording information on computer readable medium
to generate manufactures comprising the nucleotide sequence information of the present invention. 

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium 
having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will 
generally be based on the means chosen to access the stored information. In addition, a variety of data processor 
programs and formats can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file, formatted in
commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored
in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of
data-processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having
recorded thereon the nucleotide sequence information of the present invention. 
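
Two of the storage formats mentioned above, an ASCII text file and a database table, can be sketched as follows. This is only an illustration under stated assumptions: the FASTA-style file layout, the use of Python's built-in sqlite3 module as a stand-in for a database application such as DB2, Sybase or Oracle, and the table name, column names and record identifiers are all hypothetical.

```python
import sqlite3


def write_fasta(records, path, width=60):
    """Write (identifier, sequence) pairs to an ASCII file in FASTA layout."""
    with open(path, "w", encoding="ascii") as handle:
        for identifier, sequence in records:
            handle.write(f">{identifier}\n")
            for i in range(0, len(sequence), width):
                handle.write(sequence[i:i + width] + "\n")


def read_fasta(path):
    """Read a FASTA file back into a list of (identifier, sequence) pairs."""
    records, identifier, chunks = [], None, []
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if identifier is not None:
                    records.append((identifier, "".join(chunks)))
                identifier, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if identifier is not None:
        records.append((identifier, "".join(chunks)))
    return records


def store_in_database(records, connection):
    """Store the same records in a relational table (hypothetical schema)."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS contig (id TEXT PRIMARY KEY, sequence TEXT)"
    )
    connection.executemany("INSERT INTO contig VALUES (?, ?)", records)
    connection.commit()
```

The text file suits sequential reading by analysis programs, while the table form suits retrieval of individual sequences by identifier, which reflects the point that the choice of storage structure follows from the intended means of access.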

Computer software is publicly available which allows a skilled artisan to access sequence information provided in
a computer readable medium. Thus, by providing in computer readable form the nucleotide sequences of SEQ ID
NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% [...]

[...] was used to identify open reading frames (ORFs) within the Staphylococcus aureus genome which contain
homology to ORFs or proteins from both Staphylococcus aureus and from other organisms. Among the ORFs discussed
herein are protein encoding fragments of the Staphylococcus aureus genome useful in producing commercially
important proteins, such as enzymes used in fermentation reactions and in the production of commercially useful metabolites.

The present invention further provides systems, particularly computer-based systems, which contain the sequence 
information described herein. Such systems are designed to identify, among other things, commercially important frag- 
ments of the Staphylococcus aureus genome. 

As used herein, "a computer-based system" refers to the hardware means, software means, and data storage 
means used to analyze the nucleotide sequence information of the present invention. The minimum hardware means 
of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, 
output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention.

As stated above, the computer-based systems of the present invention comprise a data storage means having 
stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means 
for supporting and implementing a search means. 

As used herein, "data storage means" refers to memory which can store nucleotide sequence information of the 
present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide 
sequence information of the present invention. 

As used herein, "search means" refers to one or more programs which are implemented on the computer-based
system to compare a target sequence or target structural motif with the sequence information stored within the data
storage means. Search means are used to identify fragments or regions of the present genomic sequences which
match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly and a variety
of commercially available software for conducting search means is available and can be used in the computer-based systems
of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and
BLASTX (NCBI). A skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present computer-based systems. 

As used herein, a "target sequence" can be any DNA or amino acid sequence of six or more nucleotides or two 
or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target 
sequence will be present as a random occurrence in the database. The most preferred sequence length of a target 
sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized
that searches for commercially important fragments, such as sequence fragments involved in gene expression and 
protein processing, may be of shorter length. 
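
By way of illustration only, the relationship between target length and chance occurrence can be estimated numerically. The sketch below assumes independent, equally frequent bases (or residues) and searches a single strand; both simplifications, the function name and the database size used in the example are assumptions made for illustration and are not taken from the present disclosure.

```python
def expected_random_hits(target_len: int, database_len: int, alphabet_size: int = 4) -> float:
    """Expected number of chance exact matches of one fixed target in a random database.

    Illustrative model only: symbols are assumed independent and equally frequent,
    and only one strand of the database is considered.
    """
    if target_len < 1 or database_len < target_len:
        return 0.0
    positions = database_len - target_len + 1          # possible start positions
    return positions * (1.0 / alphabet_size) ** target_len


if __name__ == "__main__":
    database_len = 2_800_000  # hypothetical database size in nucleotides
    for n in (6, 10, 30):
        print(n, expected_random_hits(n, database_len))
```

Under these assumptions a six-nucleotide target is expected to occur by chance many times in a database of this size, whereas a 30-nucleotide target is not, which is consistent with the preferred lengths stated above.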

As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or combi- 
nation of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed 
upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, 
but are not limited to, enzymic active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences). 

A variety of structural formats for the input and output means can be used to input and output the information in 
the computer-based systems of the present invention. A preferred format for an output means ranks fragments of the 
Staphylococcus aureus genomic sequences possessing varying degrees of homology to the target sequence or target 
motif. Such presentation provides a skilled artisan with a ranking of sequences which contain various amounts of the 
target sequence or target motif and identifies the degree of homology contained in the identified fragment. 
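
A ranked output of this kind can be sketched as follows. The example is a deliberately simple, ungapped sliding-window comparison standing in for a real homology search program such as BLAST; the function names and the scoring by percentage of matching positions are assumptions made for illustration.

```python
from typing import List, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def rank_fragments(target: str, stored: str, top: int = 3) -> List[Tuple[float, int, str]]:
    """Slide the target along the stored sequence and rank windows by identity.

    Returns (percent identity, 0-based start, fragment) tuples, best match first.
    """
    n = len(target)
    hits = [
        (percent_identity(target, stored[i:i + n]), i, stored[i:i + n])
        for i in range(len(stored) - n + 1)
    ]
    # Highest identity first; ties are broken by position so the order is stable.
    hits.sort(key=lambda h: (-h[0], h[1]))
    return hits[:top]


if __name__ == "__main__":
    for identity, start, fragment in rank_fragments("ACGT", "TTTTACGTTTTTACCTTTTT"):
        print(f"{identity:5.1f}%  start={start}  {fragment}")
```

Each reported fragment carries both its rank and its degree of identity to the target, which is the information the preferred output format is described as presenting.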

A variety of comparing means can be used to compare a target sequence or target motif with the data storage 
means to identify sequence fragments of the Staphylococcus aureus genome. In the present examples, implementing 
software which implements the BLAST and BLAZE algorithms, described in Altschul et al., J. Mol. Biol. 215: 403-410
(1990), was used to identify open reading frames within the Staphylococcus aureus genome. A skilled artisan can 
readily recognize that any one of the publicly available homology search programs can be used as the search means 
for the computer-based systems of the present invention. Of course, suitable proprietary systems that may be known 
to those of skill also may be employed in this regard. 

Figure 1 provides a block diagram of a computer system illustrative of embodiments of this aspect of the present
invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104
are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage
devices 110, such as a hard drive 112 and a removable medium storage device 114. The removable medium storage 
device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable 
storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data 
recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes 
appropriate software for reading the control logic and/or the data from the removable medium storage device 114 once
it is inserted into the removable medium storage device 114. 

A nucleotide sequence of the present invention may be stored in a well known manner in the main memory 108,
any of the secondary storage devices 110, and/or a removable storage medium 116. During execution, software for
accessing and processing the genomic sequence (such as search tools, comparing tools, etc.) resides in main memory
108, in accordance with the requirements and operating parameters of the operating system, the hardware system
and the software program or programs.

BIOCHEMICAL EMBODIMENTS

Other embodiments of the present invention are directed to fragments of the Staphylococcus aureus genome, 
preferably to isolated fragments. The fragments of the Staphylococcus aureus genome of the present invention include, 
but are not limited to, fragments which encode peptides, hereinafter open reading frames (ORFs), fragments which
modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs), and fragments
which can be used to diagnose the presence of Staphylococcus aureus in a sample, hereinafter diagnostic
fragments (DFs).

As used herein, an "isolated nucleic acid molecule" or an "isolated fragment of the Staphylococcus aureus genome" 
refers to a nucleic acid molecule possessing a specific nucleotide sequence which has been subjected to purification
means to reduce, from the composition, the number of compounds which are normally associated with the composition.
Particularly, the term refers to the nucleic acid molecules having the sequences set out in SEQ ID NOS:1-5,191, to
representative fragments thereof as described above, to polynucleotides at least 95%, preferably at least 99% and 
especially preferably at least 99.9% identical in sequence thereto, also as set out above. 

A variety of purification means can be used to generate the isolated fragments of the present invention. These
include, but are not limited to, methods which separate constituents of a solution based on charge, solubility, or size.

In one embodiment, Staphylococcus aureus DNA can be mechanically sheared to produce fragments of 15-20 kb
in length. These fragments can then be used to generate a Staphylococcus aureus library by inserting them into
lambda clones as described in the Examples below. Primers flanking, for example, an ORF, such as those enumerated
in Tables 1-3, can then be generated using nucleotide sequence information provided in SEQ ID NOS:1-5,191. Well
known and routine techniques of PCR cloning then can be used to isolate the ORF from the lambda DNA library of
Staphylococcus aureus genomic DNA. Thus, given the availability of SEQ ID NOS:1-5,191, the information in Tables
1, 2 and 3, and the information that may be obtained readily by analysis of the sequences of SEQ ID NOS:1-5,191
using methods set out above, those of skill will be enabled by the present disclosure to isolate any ORF-containing or
other nucleic acid fragment of the present invention.

The isolated nucleic acid molecules of the present invention include, but are not limited to, single stranded and
double stranded DNA, and single stranded RNA.

As used herein, an "open reading frame," ORF, means a series of triplets coding for amino acids without any 
termination codons and is a sequence translatable into protein. 
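
The definition above can be illustrated with a short sketch that scans the three forward reading frames of a sequence for runs of triplets uninterrupted by a termination codon. It is a minimal illustration of the definition only and is not the GeneMark procedure referred to elsewhere herein; the function name, the minimum-length parameter and the use of the standard termination codons TAA, TAG and TGA are assumptions made for the example.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard termination codons (assumed)


def stop_free_stretches(seq: str, min_codons: int = 3) -> List[Tuple[int, int, int]]:
    """Return (frame, start, length_nt) for stop-free runs of triplets on the given strand.

    frame is 1, 2 or 3 relative to the first 5' nucleotide; start is 1-based.
    """
    seq = seq.upper()
    found = []
    for offset in range(3):
        run_start = offset
        pos = offset
        while pos + 3 <= len(seq):
            if seq[pos:pos + 3] in STOP_CODONS:
                # A termination codon closes the current run.
                if (pos - run_start) // 3 >= min_codons:
                    found.append((offset + 1, run_start + 1, pos - run_start))
                run_start = pos + 3
            pos += 3
        # Close the run that reaches the end of the sequence.
        if (pos - run_start) // 3 >= min_codons:
            found.append((offset + 1, run_start + 1, pos - run_start))
    return found


if __name__ == "__main__":
    for frame, start, length in stop_free_stretches("ATGAAACCCGGGTAAAT"):
        print(f"frame +{frame}, first nucleotide {start}, length {length} nt")
```

The opposite strand can be examined in the same way by applying the function to the reverse complement of the sequence.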

Tables 1, 2 and 3 list ORFs in the Staphylococcus aureus genomic contigs of the present invention that were
identified as putative coding regions by the GeneMark software using organism-specific second-order Markov probability
transition matrices. It will be appreciated that other criteria can be used, in accordance with well known analytical
methods, such as those discussed herein, to generate more inclusive, more restrictive or more selective lists. 

Table 1 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are at least 80 amino 
acids long and over a continuous region of at least 50 bases which are 95% or more identical (by BLAST analysis) to 
an S. aureus nucleotide sequence available through Genbank in November 1996.

Table 2 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are not in Table 1 and
match, with a BLASTP probability score of 0.01 or less, a polypeptide sequence available through Genbank by September
1996.

Table 3 sets out ORFs in the Staphylococcus aureus contigs of the present invention that do not match significantly,
by BLASTP analysis, a polypeptide sequence available through Genbank by September 1996.
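
The assignment of an ORF to Table 1, 2 or 3 can be summarized as a simple decision rule. The sketch below uses the thresholds stated above (80 amino acids, 50 bases at 95% identity, BLASTP probability of 0.01); the record type, its field names, and the treatment of an ORF that fails the Table 1 criteria as a candidate for Tables 2 and 3 are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrfHit:
    """Hypothetical summary of search results for one ORF (field names are illustrative)."""
    length_aa: int
    blastn_identity_pct: Optional[float] = None  # best match to a known S. aureus nucleotide sequence
    blastn_match_len_nt: int = 0                 # length of that continuous matching region, in bases
    blastp_probability: Optional[float] = None   # best BLASTP probability score; None if no hit


def assign_table(orf: OrfHit) -> int:
    """Return 1, 2 or 3 according to the criteria stated for Tables 1-3."""
    if (orf.length_aa >= 80
            and orf.blastn_identity_pct is not None
            and orf.blastn_identity_pct >= 95.0
            and orf.blastn_match_len_nt >= 50):
        return 1  # matches a previously available S. aureus nucleotide sequence
    if orf.blastp_probability is not None and orf.blastp_probability <= 0.01:
        return 2  # significant BLASTP match to an available polypeptide sequence
    return 3      # no significant BLASTP match


if __name__ == "__main__":
    print(assign_table(OrfHit(length_aa=120, blastn_identity_pct=98.0, blastn_match_len_nt=300)))
    print(assign_table(OrfHit(length_aa=120, blastp_probability=1e-20)))
    print(assign_table(OrfHit(length_aa=120, blastp_probability=0.5)))
```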

In each table, the first and second columns identify the ORF by respectively contig number and ORF number 
within the contig; the third column indicates the reading frame, taking the first 5' nucleotide of the contig as the start of 
the +1 frame; the fourth column indicates the first nucleotide of the ORF, counting from the 5' end of the contig strand;
and the fifth column indicates the length of each ORF in nucleotides.

In Tables 1 and 2, column six lists the "Reference" for the closest matching sequence available through Genbank.
These reference numbers are the database entry numbers commonly used by those of skill in the art, who will be
familiar with them. Descriptions of the nomenclature are available from the National Center for Biotechnology
Information.

In Table 3, the last column, column six, indicates the length of each ORF in amino acid residues.

The concepts of percent identity and percent similarity of two polypeptide sequences are well understood in the art.
For example, two polypeptides 10 amino acids in length which differ at three amino acid positions (e.g., at positions
1, 3 and 5) are said to have a percent identity of 70%. However, the same two polypeptides would be deemed to have
a percent similarity of 80% if, for example at position 5, the amino acid moieties, although not identical, were "similar"
(i.e., possessed similar biochemical characteristics). Many programs for analysis of nucleotide or amino acid sequence
similarity, such as FASTA and BLAST, specifically list percent identity of a matching region as an output parameter. Thus,
for instance, Tables 1 and 2 herein enumerate the "percent identity" of the "highest scoring segment pair" in each ORF
and its listed relative. Further details concerning the algorithms and criteria used for homology searches are provided 
below and are described in the pertinent literature highlighted by the citations provided below. 
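
The worked example given above (70% identity, 80% similarity) can be reproduced with a few lines of code. The two ten-residue peptides and the grouping of "similar" amino acids below are hypothetical and chosen only for illustration; real programs derive similarity from substitution matrices rather than from fixed groups.

```python
from typing import Tuple

# Illustrative groups of biochemically similar residues (an assumption, not a standard).
SIMILAR_GROUPS = [set("ILVM"), set("KRH"), set("DE"), set("ST"), set("FYW"), set("NQ")]


def are_similar(x: str, y: str) -> bool:
    """Identical residues, or residues sharing one of the illustrative groups, count as similar."""
    return x == y or any(x in group and y in group for group in SIMILAR_GROUPS)


def identity_and_similarity(a: str, b: str) -> Tuple[float, float]:
    """Return (percent identity, percent similarity) for two aligned, equal-length peptides."""
    if len(a) != len(b) or not a:
        raise ValueError("peptides must be non-empty and of equal length")
    identical = sum(x == y for x, y in zip(a, b))
    similar = sum(are_similar(x, y) for x, y in zip(a, b))
    return 100.0 * identical / len(a), 100.0 * similar / len(a)


if __name__ == "__main__":
    # The peptides differ at positions 1, 3 and 5; only position 5 (I versus V) is "similar".
    print(identity_and_similarity("AKLDIGTWQR", "WKDDVGTWQR"))
```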

It will be appreciated that other criteria can be used to generate more inclusive and more exclusive listings of the 
types set out in the tables. As those of skill will appreciate, narrow and broad searches both are useful. Thus, a skilled 
artisan can readily identify ORFs in contigs of the Staphylococcus aureus genome other than those listed in Tables 
1-3, such as ORFs which are overlapping or encoded by the opposite strand of an identified ORF, in addition to those
ascertainable using the computer-based systems of the present invention. 

As used herein, an "expression modulating fragment," EMF, means a series of nucleotide molecules which mod- 
ulates the expression of an operably linked ORF or EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked sequence" when the ex- 
pression of the sequence is altered by the presence of the EMF. EMFs include, but are not limited to, promoters, and 
promoter modulating sequences (inducible elements). One class of EMFs are fragments which induce the expression 
of an operably linked ORF in response to a specific regulatory factor or physiological event.

EMF sequences can be identified within the contigs of the Staphylococcus aureus genome by their proximity to 
the ORFs provided in Tables 1-3. An intergenic segment, or a fragment of the intergenic segment, from about 10 to 
200 nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate the expression of an operably 
linked ORF in a fashion similar to that found with the naturally linked ORF sequence. As used herein, an "intergenic 
segment" refers to fragments of the Staphylococcus aureus genome which are between two ORF(s) herein described. 
EMFs also can be identified using known EMFs as a target sequence or target motif in the computer-based systems 
of the present invention. Further, the two methods can be combined and used together. 
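
Identification of candidate EMFs by proximity to ORFs can be sketched as the extraction of intergenic segments from a contig. The example below takes ORF coordinates in the form used in Tables 1-3 (first nucleotide, counted from the 5' end, and length in nucleotides) and keeps segments of about 10 to 200 nucleotides. The function name, the disregard of strand, and the choice to keep the portion of a long gap nearest the downstream ORF are assumptions made for illustration.

```python
from typing import List, Tuple


def intergenic_segments(contig: str,
                        orfs: List[Tuple[int, int]],
                        min_len: int = 10,
                        max_len: int = 200) -> List[Tuple[int, str]]:
    """Return (1-based start, sequence) for segments lying between consecutive ORFs.

    orfs holds (first nucleotide, length in nucleotides) pairs, 1-based.
    """
    segments = []
    ordered = sorted(orfs)
    for (start1, length1), (start2, _) in zip(ordered, ordered[1:]):
        gap_start = start1 + length1   # first base after the upstream ORF
        gap_end = start2 - 1           # last base before the downstream ORF
        gap_len = gap_end - gap_start + 1
        if gap_len < min_len:
            continue                   # overlapping, adjacent or too closely spaced ORFs
        # Keep at most max_len bases, taken next to the downstream ORF.
        seg_start = max(gap_start, gap_end - max_len + 1)
        segments.append((seg_start, contig[seg_start - 1:gap_end]))
    return segments


if __name__ == "__main__":
    contig = "A" * 30 + "C" * 15 + "G" * 30 + "T" * 5 + "A" * 30
    print(intergenic_segments(contig, [(1, 30), (46, 30), (81, 30)]))
```

Segments obtained in this way are candidates only; whether a segment in fact modulates expression is confirmed experimentally, for example with a trap vector.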

The presence and activity of an EMF can be confirmed using an EMF trap vector. An EMF trap vector contains a 
cloning site linked to a marker sequence. A marker sequence encodes an identifiable phenotype, such as antibiotic 
resistance or a complementing nutrition auxotrophic factor, which can be identified or assayed when the EMF trap 
vector is placed within an appropriate host under appropriate conditions. As described above, an EMF will modulate the
expression of an operably linked marker sequence. A more detailed discussion of various marker sequences is provided 
below. 

A sequence which is suspected as being an EMF is cloned in all three reading frames in one or more restriction 
sites upstream from the marker sequence in the EMF trap vector. The vector is then transformed into an appropriate 
host using known procedures and the phenotype of the transformed host is examined under appropriate conditions.
As described above, an EMF will modulate the expression of an operably linked marker sequence. 

As used herein, a "diagnostic fragment," DF, means a series of nucleotide molecules which selectively hybridize 
to Staphylococcus aureus sequences. DFs can be readily identified by identifying unique sequences within contigs of 
the Staphylococcus aureus genome, such as by using well-known computer analysis software, and by generating and 
testing probes or amplification primers consisting of the DF sequence in an appropriate diagnostic format which de- 
termines amplification or hybridization selectivity. 

The sequences falling within the scope of the present invention are not limited to the specific sequences herein 
described, but also include allelic and species variations thereof. Allelic and species variations can be routinely determined
by comparing the sequences provided in SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably 99% and most preferably 99.9% identical to SEQ ID NOS:1-5,191, with a sequence
from another isolate of the same species. 

Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules coding for the same
amino acid sequences as do the nucleic acid sequences mentioned above. In other words, in the coding region of an 
ORF, substitution of one codon for another which encodes the same amino acid is expressly contemplated. 
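
Codon variability of this kind can be illustrated by translating two different nucleotide sequences and confirming that they encode the same amino acids. The sketch below uses the standard genetic code; the function names and the two short example sequences are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order of first, second and third base.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate complete codons of a coding sequence; '*' marks a termination codon."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - len(dna) % 3, 3))


def encode_same_polypeptide(a: str, b: str) -> bool:
    """True if two different nucleotide sequences encode an identical amino acid sequence."""
    return a.upper() != b.upper() and translate(a) == translate(b)


if __name__ == "__main__":
    # GCT and GCC both encode alanine; AAA and AAG both encode lysine.
    print(translate("ATGGCTAAA"), translate("ATGGCCAAG"))
    print(encode_same_polypeptide("ATGGCTAAA", "ATGGCCAAG"))
```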

Any specific sequence disclosed herein can be readily screened for errors by resequencing a particular fragment, 
such as an ORF, in both directions (i.e., sequence both strands). Alternatively, error screening can be performed by 
sequencing corresponding polynucleotides of Staphylococcus aureus origin isolated by using part or all of the fragments
in question as a probe or primer. 

Each of the ORFs of the Staphylococcus aureus genome disclosed in Tables 1, 2 and 3, and the EMFs found 5'
to the ORFs, can be used as polynucleotide reagents in numerous ways. For example, the sequences can be used 
as diagnostic probes or diagnostic amplification primers to detect the presence of a specific microbe in a sample, 
particularly Staphylococcus aureus. Especially preferred in this regard are ORFs such as those of Table 3, which do not
match previously characterized sequences from other organisms and thus are most likely to be highly selective for
Staphylococcus aureus. Also particularly preferred are ORFs that can be used to distinguish between strains of
Staphylococcus aureus, particularly those that distinguish medically important strains, such as drug-resistant strains.

In addition, the fragments of the present invention, as broadly described, can be used to control gene expression
through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a
polynucleotide sequence to DNA or RNA. Triple helix formation optimally results in a shut-off of RNA transcription from DNA,
while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Information from the
sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides.
Polynucleotides suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary
to a region of the gene involved in transcription, for triple-helix formation, or to the mRNA itself, for antisense inhibition.
Both techniques have been demonstrated to be effective in model systems, and the requisite techniques are well known
and involve routine procedures. Triple helix techniques are discussed in, for example, Lee et al., Nucl. Acids Res. 6:
3073 (1979); Cooney et al., Science 241: 456 (1988); and Dervan et al., Science 251: 1360 (1991). Antisense
techniques in general are discussed in, for instance, Okano, J. Neurochem. 56: 560 (1991) and
OLIGODEOXYNUCLEOTIDES AS ANTISENSE INHIBITORS OF GENE EXPRESSION, CRC Press, Boca Raton, FL (1988).
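
The design of an antisense oligonucleotide complementary to a region of an mRNA can be sketched as a reverse-complement operation. The example below returns a DNA oligonucleotide of 20 to 40 bases, the range stated above; the function name and the mRNA sequence are hypothetical, and practical design would also consider factors such as target accessibility that are not modeled here.

```python
# RNA base -> complementary DNA base
RNA_TO_COMPLEMENTARY_DNA = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligonucleotide (5' to 3') complementary to mrna[start:start + length].

    start is 0-based; length must lie in the 20 to 40 base range.
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be between 20 and 40 bases")
    region = mrna[start:start + length].upper()
    if len(region) != length:
        raise ValueError("target region extends beyond the end of the mRNA")
    # Complement each base, then reverse so the oligonucleotide reads 5' to 3'.
    return region.translate(RNA_TO_COMPLEMENTARY_DNA)[::-1]


if __name__ == "__main__":
    print(antisense_oligo("AUGGCUAAAGGUCCUGAAUUCGGA", 0, 20))
```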

The present invention further provides recombinant constructs comprising one or more fragments of the Staphylococcus
aureus genomic fragments and contigs of the present invention. Certain preferred recombinant constructs of
the present invention comprise a vector, such as a plasmid or viral vector, into which a fragment of the Staphylococcus
aureus genome has been inserted, in a forward or reverse orientation. In the case of a vector comprising one of the
ORFs of the present invention, the vector may further comprise regulatory sequences, including for example, a promoter,
operably linked to the ORF. For vectors comprising the EMFs of the present invention, the vector may further
comprise a marker sequence or heterologous ORF operably linked to the EMF.

Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially 
available for generating the recombinant constructs of the present invention. The following vectors are provided by 
way of example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK and KS (+ and -), pNH8a,
pNH16a, pNH18a, pNH46a (available from Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available
from Pharmacia). Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG (available from Stratagene);
pSVK3, pBPV, pMSG, pSVL (available from Pharmacia).

Promoter regions can be selected from any desired gene using CAT (chloramphenicol transferase) vectors or other
vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters
include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV
thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate
vector and promoter is well within the level of ordinary skill in the art.

The present invention further provides host cells containing any one of the isolated fragments of the Staphylococcus
aureus genomic fragments and contigs of the present invention, wherein the fragment has been introduced into the
host cell using known methods. The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or a procaryotic cell, such as a bacterial cell.

A polynucleotide of the present invention, such as a recombinant construct comprising an ORF of the present 
invention, may be introduced into the host by a variety of well established techniques that are standard in the art, such 
as calcium phosphate transfection, DEAE-dextran mediated transfection and electroporation, which are described in,
for instance, Davis, L. et al., BASIC METHODS IN MOLECULAR BIOLOGY (1986).

A host cell containing one of the fragments of the Staphylococcus aureus genomic fragments and contigs of the
present invention can be used in conventional manners to produce the gene product encoded by the isolated fragment
(in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF.

The present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present
invention or by degenerate variants of the nucleic acid fragments of the present invention. By "degenerate variant" is
intended nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by
nucleotide sequence but, due to the degeneracy of the Genetic Code, encode an identical polypeptide sequence.

Preferred nucleic acid fragments of the present invention are the ORFs depicted in Tables 2 and 3 which encode 
proteins. 

A variety of methodologies known in the art can be utilized to obtain any one of the isolated polypeptides or proteins
of the present invention. At the simplest level, the amino acid sequence can be synthesized using commercially available
peptide synthesizers. This is particularly useful in producing small peptides and fragments of larger polypeptides.

... to isolate and purify polypeptides or proteins of the present invention produced naturally by a bacterial strain,
or by other methods. Methods for isolation and purification that can be employed in this regard include, but are not
limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and
immuno-affinity chromatography.

The polypeptides and proteins of the present invention also can be purified from cells which have been altered to
express the desired polypeptide or protein. As used herein, a cell is said to be altered to express a desired polypeptide 
or protein when the cell, through genetic manipulation, is made to produce a polypeptide or protein which it normally 
does not produce or which the cell normally produces at a lower level. Those skilled in the art can readily adapt pro- 
cedures for introducing and expressing either recombinant or synthetic sequences into eukaryotic or prokaryotic cells 
in order to generate a cell which produces one of the polypeptides or proteins of the present invention. 

Any host/vector system can be used to express one or more of the ORFs of the present invention. These include, 
but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic
hosts such as E. coli and B. subtilis. The most preferred cells are those which do not normally express the particular
polypeptide or protein or which express the polypeptide or protein at a low natural level.

"Recombinant," as used herein, means that a polypeptide or protein is derived from recombinant (e.g., microbial 
or mammalian) expression systems. "Microbial" refers to recombinant polypeptides or proteins made in bacterial or 
fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" defines a polypeptide or protein essentially
free of native endogenous substances and unaccompanied by associated native glycosylation. Polypeptides or
proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern different from that expressed in mammalian cells. 

"Nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides. Generally, DNA segments encoding the 
polypeptides and proteins provided by this invention are assembled from fragments of the Staphylococcus aureus 
genome and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is 
capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a mi- 
crobial or viral operon. 

"Recombinant expression vehicle or vector" refers to a plasmid or phage or virus or vector, for expressing a polypep- 
tide from a DNA (RNA) sequence. The expression vehicle can comprise a transcriptional unit comprising an assembly 
of (1) genetic regulatory elements necessary for gene expression in the host, including elements required to initiate
and maintain transcription at a level sufficient for suitable expression of the desired polypeptide, including, for example,
promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a structural or coding sequence
which is transcribed into mRNA and translated into protein; and (3) appropriate signals to initiate translation at the
beginning of the desired coding region and terminate translation at its end. Structural units intended for use in yeast 
or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated 
protein by a host cell. Alternatively, where recombinant protein is expressed without a leader or transport sequence, 
it may include an N-terminal methionine residue. This residue may or may not be subsequently cleaved from the 
expressed recombinant protein to provide a final product. 

"Recombinant expression system" means host cells which have stably integrated a recombinant transcriptional 
unit into chromosomal DNA or carry the recombinant transcriptional unit extrachromosomally. The cells can be prokaryotic
or eukaryotic. Recombinant expression systems as defined herein will express heterologous polypeptides or proteins
upon induction of the regulatory elements linked to the DNA segment or synthetic gene to be expressed.

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appro- 
priate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived 
from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic 
and eukaryotic hosts are described in Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Edition,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which is hereby
incorporated by reference in its entirety. 

Generally, recombinant expression vectors will include origins of replication and selectable markers permitting 
transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a 
promoter derived from a highly expressed gene to direct transcription of a downstream structural sequence. Such 
promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), alpha- 
factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled 
in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable 
of directing secretion of translated protein into the periplasmic space or extracellular medium. Optionally, the heterol- 
ogous sequence can encode a fusion protein including an N-terminal identification peptide imparting desired charac- 
teristics, e.g., stabilization or simplified purification of expressed recombinant product. 

Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a 
desired protein together with suitable translation initiation and termination signals in operable reading phase with a 
functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication
to ensure maintenance of the vector and, when desirable, provide amplification within the host. 

Suitable prokaryotic hosts for transformation include strains of Staphylococcus aureus, E. coli, B. subtilis, Salmonella
typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus. Others
may also be employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use can comprise a selectable 
marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements 
of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3 
(available from Pharmacia Fine Chemicals, Uppsala, Sweden) and pGEM1 (available from Promega Biotec, Madison,
WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence
to be expressed. 

Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the 
selected promoter, where it is inducible, is derepressed or induced by appropriate means (e.g., temperature shift or 
chemical induction) and cells are cultured for an additional period to provide for expression of the induced gene product.
Thereafter, cells are typically harvested, generally by centrifugation, disrupted to release expressed protein, generally
by physical or chemical means, and the resulting crude extract is retained for further purification. 

Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of mam- 
malian expression systems include the COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23: 175 
(1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines. 

Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also 
any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination 
sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide the required
nontranscribed genetic elements.

Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from
cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps.
Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell lysing agents. Protein refolding steps can be used, as necessary,
in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can
be employed for final purification steps.

An additional aspect of the invention includes Staphylococcus aureus polypeptides which are useful as immunodiagnostic
antigens and/or immunoprotective vaccines, collectively "immunologically useful polypeptides". Such immunologically
useful polypeptides may be selected from the ORFs disclosed herein based on techniques well known
in the art and described elsewhere herein. The inventors have used the following criteria to select several immunologically
useful polypeptides:

As is known in the art, an amino terminal type I signal sequence directs a nascent protein across the plasma and
outer membranes to the exterior of the bacterial cell. Such outermembrane polypeptides are expected to be immunologically
useful. According to Izard, J. W. et al., Mol. Microbiol. 13, 765-773 (1994), polypeptides containing type I
signal sequences contain the following physical attributes: The length of the type I signal sequence is approximately
15 to 25 primarily hydrophobic amino acid residues with a net positive charge in the extreme amino terminus; the
central region of the signal sequence must adopt an alpha-helical conformation in a hydrophobic environment; and the
region surrounding the actual site of cleavage is ideally six residues long, with small side-chain amino acids in the -1
and -3 positions.

Also known in the art is the type IV signal sequence, which is an example of the several types of functional signal
sequences which exist in addition to the type I signal sequence detailed above. Although functionally related, the type
IV signal sequence possesses a unique set of biochemical and physical attributes (Strom, M. S. and Lory, S., J. Bacteriol.
174, 7345-7351; 1992). These are typically six to eight amino acids with a net basic charge followed by an
additional sixteen to thirty primarily hydrophobic residues. The cleavage site of a type IV signal sequence is typically
after the initial six to eight amino acids at the extreme amino terminus. In addition, all type IV signal sequences contain
a phenylalanine residue at the +1 site relative to the cleavage site.

Studies of the cleavage sites of twenty-six bacterial lipoprotein precursors have allowed the definition of a consensus
amino acid sequence for lipoprotein cleavage. Nearly three-fourths of the bacterial lipoprotein precursors examined
contained the sequence L-(A,S)-(G,A)-C at positions -3 to +1, relative to the point of cleavage (Hayashi, S. and Wu,
H. C., Lipoproteins in bacteria, J. Bioenerg. Biomembr. 22, 451-471; 1990).

[...] examined. The amino acid sequence of this region is L-P-X-T-G-X, where X is any amino acid.

Amino acid sequence similarities to proteins of known function, found by BLAST, enable the assignment of putative
functions to novel amino acid sequences and allow for the selection of proteins thought to function outside the cell
wall. Such proteins are well known in the art and include "lipoprotein", "periplasmic", or "antigen".
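
The lipoprotein cleavage consensus L-(A,S)-(G,A)-C and the L-P-X-T-G-X motif given above can be searched for with simple pattern matching. The following Python sketch is offered only as an illustration and is not the inventors' selection algorithm; the function name `find_motifs`, the one-letter amino acid input format, and the demonstration sequence are assumptions introduced here.

```python
import re

# Lipoprotein cleavage consensus L-(A,S)-(G,A)-C (positions -3 to +1).
# A lookahead is used so that overlapping occurrences are also reported.
LIPOPROTEIN_CONSENSUS = re.compile(r"(?=L[AS][GA]C)")

# L-P-X-T-G-X motif, where X is any amino acid (any single letter here).
LPXTGX_MOTIF = re.compile(r"(?=LP[A-Z]TG[A-Z])")


def find_motifs(protein_sequence):
    """Return 1-based start positions of each motif in a one-letter sequence."""
    sequence = protein_sequence.upper()
    return {
        "lipoprotein_consensus": [
            match.start() + 1 for match in LIPOPROTEIN_CONSENSUS.finditer(sequence)
        ],
        "lpxtgx": [
            match.start() + 1 for match in LPXTGX_MOTIF.finditer(sequence)
        ],
    }


if __name__ == "__main__":
    # Hypothetical sequence made up for illustration; not taken from the Sequence Listing.
    demo_sequence = "MKKLLSGCAAALPETGEK"
    print(find_motifs(demo_sequence))
```

A hit from such a search is only a candidate: the signal sequence length, charge, and hydrophobicity criteria described above would still have to be checked separately.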

An algorithm for selecting antigenic and immunogenic Staphylococcus aureus polypeptides including the foregoing
criteria was developed by the present inventors. Use of the algorithm by the inventors to select immunologically useful
Staphylococcus aureus polypeptides resulted in the selection of several ORFs which are predicted to be outer
membrane-associated proteins. These proteins are identified in Table 4, below, and shown in the Sequence Listing as SEQ
ID NOS:5,192 to 5,255. Thus the amino acid sequence of each of several antigenic Staphylococcus aureus polypeptides
listed in Table 4 can be determined, for example, by locating the amino acid sequence of the ORF in the Sequence
Listing. Likewise the polynucleotide sequence encoding each ORF can be found by locating the corresponding
polynucleotide SEQ ID in Tables 1, 2, or 3, and finding the corresponding nucleotide sequence in the sequence listing.

As will be appreciated by those of ordinary skill in the art, although a polypeptide representing an entire ORF may 
be the closest approximation to a protein found in vivo, it is not always technically practical to express a complete ORF 
in vitro. It may be very challenging to express and purify a highly hydrophobic protein by common laboratory methods. 
As a result, the immunologically useful polypeptides described herein as SEQ ID NOS:5,192-5,255 may have been
modified slightly to simplify the production of recombinant protein, and are the preferred embodiments. In general, 
nucleotide sequences which encode highly hydrophobic domains, such as those found at the amino terminal signal 
sequence, are excluded for enhanced in vitro expression of the polypeptides. Furthermore, any highly hydrophobic 
amino acid sequences occurring at the carboxy terminus are also excluded. Such truncated polypeptides include for 
example the mature forms of the polypeptides expected to exist in nature. 

Those of ordinary skill in the art can identify soluble portions of the polypeptides identified in Table 4 and, in the case
of truncated polypeptide sequences shown as SEQ ID NOS:5,192-5,255, may obtain the complete predicted amino
acid sequence of each polypeptide by translating the corresponding polynucleotide sequences of the corresponding
ORF listed in Tables 1, 2 and 3 and found in the sequence listing.

Accordingly, polypeptides comprising the complete amino acid sequence of an immunologically useful polypeptide selected
from the group of polypeptides encoded by the ORFs identified in Table 4, or an amino acid sequence at least 95% 
identical thereto, preferably at least 97% identical thereto, and most preferably at least 99% identical thereto form an 
embodiment of the invention; in addition polypeptides comprising an amino acid sequence selected from the group of 
amino acid sequences shown in the sequence listing as SEQ ID NOS:5,191-5,255, or an amino acid sequence at least
95% identical thereto, preferably at least 97% identical thereto and most preferably at least 99% identical thereto, form 
an embodiment of the invention. Polynucleotides encoding the foregoing polypeptides also form part of the present 
invention. 

In another aspect, the invention provides a peptide or polypeptide comprising an epitope-bearing portion of a 
polypeptide of the invention, particularly those epitope-bearing portions (antigenic regions) identified in Table 4. The 
epitope-bearing portion is an immunogenic or antigenic epitope of a polypeptide of the invention. An "immunogenic 
epitope" is defined as a part of a protein that elicits an antibody response when the whole protein is the immunogen. 
On the other hand, a region of a protein molecule to which an antibody can bind is defined as an "antigenic epitope." 
The number of immunogenic epitopes of a protein generally is less than the number of antigenic epitopes. See, for 
instance, Geysen et al., Proc. Natl. Acad. Sci. USA 81:3998-4002 (1983).

As to the selection of peptides or polypeptides bearing an antigenic epitope (i.e., that contain a region of a protein 
molecule to which an antibody can bind), it is well known in the art that relatively short synthetic peptides that mimic
part of a protein sequence are routinely capable of eliciting an antiserum that reacts with the partially mimicked protein.
See, for instance, Sutcliffe, J. G., Shinnick, T. M., Green, N. and Lerner, R. A. (1983) "Antibodies that react with
predetermined sites on proteins", Science, 219:660-666. Peptides capable of eliciting protein-reactive sera are fre- 
quently represented in the primary sequence of a protein, can be characterized by a set of simple chemical rules, and 
are confined neither to immunodominant regions of intact proteins (i.e., immunogenic epitopes) nor to the amino or 
carboxyl terminals. Antigenic epitope-bearing peptides and polypeptides of the invention are therefore useful to raise 
antibodies, including monoclonal antibodies, that bind specifically to a polypeptide of the invention. See, for instance, 
Wilson et al., Cell 37:767-778 (1984) at 777. 

Antigenic epitope-bearing peptides and polypeptides of the invention preferably contain a sequence of at least
seven, more preferably at least nine and most preferably between about 15 and about 30 amino acids contained within
the amino acid sequence of a polypeptide of the invention. Non-limiting examples of antigenic polypeptides or peptides
that can be used to generate S. aureus specific antibodies include a polypeptide comprising peptides shown in Table
4 below. These polypeptide fragments have been determined to bear antigenic epitopes of indicated S. aureus proteins
by the analysis of the Jameson-Wolf antigenic index, a representative sample of which is shown in Figure 3.

The epitope-bearing peptides and polypeptides of the invention may be produced by any conventional means.
See, e.g., Houghten, R. A. (1985) General method for the rapid solid-phase synthesis of large numbers of peptides:
specificity of antigen-antibody interaction at the level of individual amino acids. Proc. Natl. Acad. Sci. USA 82:
5131-5135; this "Simultaneous Multiple Peptide Synthesis (SMPS)" process is further described in U.S. Patent No.
4,631,211 to Houghten et al. (1986). Epitope-bearing peptides and polypeptides of the invention are used to induce
antibodies according to methods well known in the art. See, for instance, Sutcliffe et al., supra; Wilson et al., supra;
Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et al., J. Gen. Virol. 66:2347-2354 (1985).

Immunogenic epitope-bearing peptides of the invention, i.e., those parts of a protein that elicit an antibody response
when the whole protein is the immunogen, are identified according to methods known in the art. See, for instance,
Geysen et al., supra. Further still, U.S. Patent No. 5,194,392 to Geysen (1990) describes a general method of detecting
or determining the sequence of monomers (amino acids or other compounds) which is a topological equivalent of the
epitope (i.e., a "mimotope") which is complementary to a particular paratope (antigen binding site) of an antibody of
interest. More generally, U.S. Patent No. 4,433,092 to Geysen (1989) describes a method of detecting or determining
a sequence of monomers which is a topographical equivalent of a ligand which is complementary to the ligand binding
site of a particular receptor of interest. Similarly, U.S. Patent No. 5,480,971 to Houghten, R. A. et al. (1996) on Peralkylated
Oligopeptide Mixtures discloses linear C1-C7-alkyl peralkylated oligopeptides and sets and libraries of such
peptides, as well as methods for using such oligopeptide sets and libraries for determining the sequence of a peralkylated
oligopeptide that preferentially binds to an acceptor molecule of interest. Thus, non-peptide analogs of the
epitope-bearing peptides of the invention also can be made routinely by these methods.

Table 4 lists immunologically useful polypeptides identified by an algorithm which locates novel Staphylococcus
aureus outer membrane proteins, as is described above. Also listed are epitopes or "antigenic regions" of each of the
identified polypeptides. The antigenic regions, or epitopes, are delineated by two numbers x-y, where x is the number
of the first amino acid in the open reading frame included within the epitope and y is the number of the last amino acid
in the open reading frame included within the epitope. For example, the first epitope in ORF 168-6 is comprised of
amino acids 36 to 45 of SEQ ID NO:5,192, as is described in Table 4. The inventors have identified several epitopes
for each of the antigenic polypeptides identified in Table 4. Accordingly, forming part of the present invention are
polypeptides comprising an amino acid sequence of one or more antigenic regions identified in Table 4. The invention
further provides polynucleotides encoding such polypeptides.
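
The x-y notation for antigenic regions uses 1-based, inclusive residue numbers, which differs from the 0-based, end-exclusive slicing used by many programming languages. The following Python sketch shows one way the conversion might be handled; the function names and the demonstration sequence are assumptions made for illustration and are not taken from Table 4 or the Sequence Listing.

```python
def parse_region(region):
    """Parse an 'x-y' antigenic region string into two integers (x, y)."""
    first_text, last_text = region.split("-")
    return int(first_text), int(last_text)


def extract_epitope(orf_sequence, region):
    """Return residues x..y (1-based, inclusive) of a one-letter ORF sequence."""
    first, last = parse_region(region)
    if not (1 <= first <= last <= len(orf_sequence)):
        raise ValueError(
            f"region {region} lies outside a sequence of length {len(orf_sequence)}"
        )
    # Residue x is at index x-1; the slice end is exclusive, so y is used as-is.
    return orf_sequence[first - 1:last]


if __name__ == "__main__":
    # Hypothetical 20-residue sequence used only to demonstrate the indexing.
    demo_orf = "ACDEFGHIKLMNPQRSTVWY"
    print(extract_epitope(demo_orf, "3-7"))
```

Under this convention a region written as 36-45 spans ten residues, as in the example of ORF 168-6 given above.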

The present invention further includes isolated polypeptides, proteins and nucleic acid molecules which are substantially
equivalent to those herein described. As used herein, substantially equivalent can refer both to nucleic acid
and amino acid sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity between
reference and subject sequences. For purposes of the present invention, sequences having equivalent biological
activity, and equivalent expression characteristics are considered substantially equivalent. For purposes of determining
equivalence, truncation of the mature sequence should be disregarded.

The invention further provides methods of obtaining homologs from other strains of Staphylococcus aureus, of the
fragments of the Staphylococcus aureus genome of the present invention and homologs of the proteins encoded by
the ORFs of the present invention. As used herein, a sequence or protein of Staphylococcus aureus is defined as a
homolog of one of the Staphylococcus aureus fragments or contigs, or of a protein encoded by one of the ORFs
of the present invention, if it shares significant homology to one of the fragments of the Staphylococcus aureus genome
of the present invention or a protein encoded by one of the ORFs of the present invention. Specifically, by using the
sequence disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque hybridization,
one skilled in the art can obtain homologs.

As used herein, two nucleic acid molecules or proteins are said to "share significant homology" if the two contain
regions which possess greater than 85% sequence (amino acid or nucleic acid) homology. Preferred homologs in this
regard are those with more than 90% homology. Especially preferred are those with 93% or more homology. Among
especially preferred homologs those with 95% or more homology are particularly preferred. Very particularly preferred
among these are those with 97% and even more particularly preferred among those are homologs with 99% or more
homology. The most preferred homologs among these are those with 99.9% homology or more. It will be understood
that, among measures of homology, identity is particularly preferred in this regard.
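
Because identity is the preferred measure of homology, the thresholds above can be applied to a percent identity value. The following Python sketch is illustrative only: it assumes the two sequences have already been aligned to equal length by some external tool, with "-" marking gaps, and it counts gap-containing columns as mismatches, which is one of several possible conventions. The function names and tier labels are introduced here for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of two pre-aligned, equal-length sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    # A column with a gap in only one sequence counts as a mismatch (assumed convention).
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def homology_tier(percent):
    """Map a percent value onto the thresholds stated in the text above."""
    if percent >= 99.9:
        return "99.9% or more"
    if percent >= 99.0:
        return "99% or more"
    if percent >= 97.0:
        return "97% or more"
    if percent >= 95.0:
        return "95% or more"
    if percent >= 93.0:
        return "93% or more"
    if percent > 90.0:
        return "more than 90%"
    if percent > 85.0:
        return "significant homology (greater than 85%)"
    return "below the 85% threshold"


if __name__ == "__main__":
    # Hypothetical aligned sequences differing at one of ten positions.
    identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
    print(identity, homology_tier(identity))
```

Note that the two lowest thresholds are strict ("greater than", "more than") while the higher ones are inclusive ("or more"), so a value of exactly 90% falls in the "significant homology" tier rather than the "preferred" tier.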

Region specific primers or probes derived from the nucleotide sequence provided in SEQ ID NOS:1-5,191 or from
a nucleotide sequence at least 95%, particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ
ID NOS:1-5,191 can be used to prime DNA synthesis and PCR amplification, as well as to identify colonies containing
cloned DNA encoding a homolog. Methods suitable to this aspect of the present invention are well known and have
[...]. One skilled in the art will recognize that by employing high stringency
conditions (e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC) only
sequences which are greater than 75% homologous to the primer will be amplified. By employing lower stringency
conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X SSPC),
sequences which are greater than 40-50% homologous to the primer will also be amplified.

When using DNA probes derived from SEQ ID NOS:1-5,191, or from a nucleotide sequence having an aforementioned
identity to a sequence of SEQ ID NOS:1-5,191, for colony/plaque hybridization, one skilled in the art will recognize
that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X SSPC and 50% formamide, and
washing at 50-65°C in 0.5X SSPC), sequences having regions which are greater than 90% homologous to the probe
can be obtained, and that by employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and
40-45% formamide, and washing at 42°C in 0.5X SSPC), sequences having regions which are greater than 35-45%
homologous to the probe will be obtained.

Any organism can be used as the source for homologs of the present invention so long as the organism naturally 
expresses such a protein or contains genes encoding the same. The most preferred organisms for isolating homologs
are bacteria which are closely related to Staphylococcus aureus.

ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION 

Each ORF provided in Tables 1 and 2 is identified with a function by homology to a known gene or polypeptide. 
As a result, one skilled in the art can use the polypeptides of the present invention for commercial, therapeutic and 
industrial purposes consistent with the type of putative identification of the polypeptide. Such identifications permit one 
skilled in the art to use the Staphylococcus aureus ORFs in a manner similar to the known type of sequences for which 
the identification is made; for example, to ferment a particular sugar source or to produce a particular metabolite. A 
variety of reviews illustrative of this aspect of the invention are available, including the following reviews on the industrial 
use of enzymes, for example, BIOCHEMICAL ENGINEERING AND BIOTECHNOLOGY HANDBOOK, 2nd Ed., Macmillan
Publications, Ltd. NY (1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al., Eds., Elsevier
Science Publishers, Amsterdam, The Netherlands (1985). A variety of exemplary uses that illustrate this and similar 
aspects of the present invention are discussed below. 

1. Biosynthetic Enzymes 

Open reading frames encoding proteins that mediate the catalytic reactions of intermediary and
macromolecular metabolism, the biosynthesis of small molecules, cellular processes and other functions can be used
for industrial biosynthesis. These include enzymes involved in the degradation of the intermediary products of
metabolism, enzymes involved in central intermediary metabolism, enzymes involved in respiration, both aerobic and
anaerobic, enzymes involved in fermentation, enzymes involved in ATP proton motive force conversion, enzymes
involved in broad regulatory function, enzymes involved in amino acid synthesis, enzymes involved in nucleotide
synthesis, and enzymes involved in cofactor and vitamin synthesis.

The various metabolic pathways present in Staphylococcus aureus can be identified based on absolute nutritional 
requirements as well as by examining the various enzymes identified in Tables 1-3 and SEQ ID NOS:1-5,191.

Of particular interest are polypeptides involved in the degradation of intermediary metabolites as well as non- 
macromolecular metabolism. Such enzymes include amylases, glucose oxidases, and catalase. 

Proteolytic enzymes are another class of commercially important enzymes. Proteolytic enzymes find use in a 
number of industrial processes including the processing of flax and other vegetable fibers, in the extraction, clarification 
and depectinization of fruit juices, in the extraction of vegetable oil and in the maceration of fruits and vegetables to
give unicellular fruits. A detailed review of the proteolytic enzymes used in the food industry is provided in Rombouts
et al., Symbiosis 21:79 (1986) and Voragen et al. in BIOCATALYSTS IN AGRICULTURAL BIOTECHNOLOGY, Whitaker
et al., Eds., American Chemical Society Symposium Series 389:93 (1989).

The metabolism of sugars is an important aspect of the primary metabolism of Staphylococcus aureus. Enzymes 
involved in the degradation of sugars, such as, particularly, glucose, galactose, fructose and xylose, can be used in 
industrial fermentation. Some of the important sugar transforming enzymes, from a commercial viewpoint, include 
sugar isomerases such as glucose isomerase. Other metabolic enzymes have found commercial use, such as glucose
oxidases which produce ketogulonic acid (KGA). KGA is an intermediate in the commercial production of ascorbic
acid using the Reichstein procedure, as described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag
Press, Weinheim, Germany (1984).

Glucose oxidase (GOD) is commercially available and has been used in purified form as well as in an immobilized
form for the deoxygenation of beer. See, for instance, Hartmeir et al., Biotechnology Letters V. 21 (1979). The most
important application of GOD is the industrial scale fermentation of gluconic acid. There is a market for gluconic acids, which are
used in the detergent, textile, leather, photographic, pharmaceutical, food, feed and concrete industries, as described,
for example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS AND FUNGI, Benett et al., Eds.,
Academic Press, New York (1985). In addition to industrial applications, GOD has found applications in medicine for
quantitative determination of glucose in body fluids and, recently, in biotechnology for analyzing syrups from starch and
cellulose hydrolysates. This application is described in Owusu et al., Biochim. et Biophys. Acta 872:83 (1986), for
instance.

The main sweetener used in the world today is sugar which comes from sugar beets and sugar cane. In the field
of industrial enzymes, the glucose isomerase process shows the largest expansion in the market today. Initially, soluble
enzymes were used and later immobilized enzymes were developed (Krueger et al., Biotechnology, The Textbook of
Industrial Microbiology, Sinauer Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the use of
glucose-produced high fructose syrups is by far the largest industrial business using immobilized enzymes. A review of
the industrial use of these enzymes is provided by Jorgensen, Starch 40:307 (1988).

Proteinases, such as alkaline serine proteinases, are used as detergent additives and thus represent one of the
largest volumes of microbial enzymes used in the industrial sector. Because of their industrial importance, there is a
large body of published and unpublished information regarding the use of these enzymes in industrial processes. (See
Faultman et al., Acid Proteases Structure Function and Biology, Tang, J., ed., Plenum Press, New York (1977) and
Godfrey et al., Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al., Report Industrial Enzymes
by 1990, Hel Hepner & Associates, London (1986)).

Another class of commercially usable proteins of the present invention are the microbial lipases, described by, for
instance, Macrae et al., Philosophical Transactions of the Royal Society of London 310:227 (1985) and Poserke, Journal
of the American Oil Chemists' Society 61:1758 (1984). A major use of lipases is in the fat and oil industry for the
production of neutral glycerides using lipase catalyzed inter-esterification of readily available triglycerides. Applications
of lipases include the use as a detergent additive to facilitate the removal of fats from fabrics in the course of the
washing procedures.

The use of enzymes, and in particular microbial enzymes, as catalysts for key steps in the synthesis of complex
organic molecules is gaining popularity at a great rate. One area of great interest is the preparation of chiral intermediates.
Preparation of chiral intermediates is of interest to a wide range of synthetic chemists, particularly those scientists
involved with the preparation of new pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies et al., Recent
Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press, Boca Raton, Florida (1990)).
The following reactions catalyzed by enzymes are of interest to organic chemists: hydrolysis of carboxylic acid esters,
phosphate esters, amides and nitriles, esterification reactions, trans-esterification reactions, synthesis of amides,
reduction of alkanones and oxoalkanates, oxidation of alcohols to carbonyl compounds, oxidation of sulfides to sulfoxides,
and carbon bond forming reactions such as the aldol reaction.

When considering the use of an enzyme encoded by one of the ORFs of the present invention for biotransformation 
and organic synthesis it is sometimes necessary to consider the respective advantages and disadvantages of using a 
microorganism as opposed to an isolated enzyme. The pros and cons of using a whole cell system on the one hand or an
isolated partially purified enzyme on the other hand have been described in detail by Bud et al., Chemistry in Britain
(1987), p. 127.

Amino transferases, enzymes involved in the biosynthesis and metabolism of amino acids, are useful in the catalytic
production of amino acids. The advantage of using microbial based enzyme systems is that the amino transferase
enzymes catalyze the stereo-selective synthesis of only L-amino acids and generally possess uniformly high catalytic
rates. A description of the use of amino transferases for amino acid production is provided by Roselle-David, Methods
of Enzymology 136:479 (1987).

Another category of useful proteins encoded by the ORFs of the present invention includes enzymes involved in
nucleic acid synthesis, repair, and recombination. A variety of commercially important enzymes have previously been
isolated from members of Staphylococcus aureus. These include Sau3A and Sau96I.

2. Generation of Antibodies

As described here, the proteins of the present invention, as well as homologs thereof, can be used in a variety
of procedures and methods known in the art which are currently applied to other proteins. The proteins of the present
invention can further be used to generate an antibody which selectively binds the protein. Such antibodies can be
either monoclonal or polyclonal antibodies, as well as fragments of these antibodies, and humanized forms.

The invention further provides antibodies which selectively bind to one of the proteins of the present invention and
hybridomas which produce these antibodies. A hybridoma is an immortalized cell line which is capable of secreting
[...] LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, Elsevier Science Publishers,
Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. Methods 35:1-21 (1980); Kohler and Milstein, Nature
256:495-497 (1975)), the trioma technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today
4:72 (1983); pgs. 77-96 of Cole et al., in MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc.
(1985)).

Any animal (mouse, rabbit, etc.) which is known to produce antibodies can be immunized with the pseudogene 
polypeptide. Methods for immunization are well known in the art. Such methods include subcutaneous or intraperitoneal
injection of the polypeptide. One skilled in the art will recognize that the amount of the protein encoded by the ORF of 
the present invention used for immunization will vary based on the animal which is immunized, the antigenicity of the 
peptide and the site of injection. 

The protein which is used as an immunogen may be modified or administered in an adjuvant in order to increase 
the protein's antigenicity. Methods of increasing the antigenicity of a protein are well known in the art and include, but 
are not limited to coupling the antigen with a heterologous protein (such as globulin or galactosidase) or through the 
inclusion of an adjuvant during immunization. 

For monoclonal antibodies, spleen cells from the immunized animals are removed, fused with myeloma cells, such 
as SP2/0-Ag14 myeloma cells, and allowed to become monoclonal antibody producing hybridoma cells. 

Any one of a number of methods well known in the art can be used to identify the hybridoma cell which produces 
an antibody with the desired characteristics. These include screening the hybridomas with an ELISA assay, western
blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res. 175:109-124 (1988)).

Hybridomas secreting the desired antibodies are cloned and the class and subclass are determined using procedures
known in the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and Mo- 
lecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1984)). 

Techniques described for the production of single chain antibodies (U. S. Patent 4,946,778) can be adapted to 
produce single chain antibodies to proteins of the present invention. 

For polyclonal antibodies, antibody containing antisera is isolated from the immunized animal and is screened for 
the presence of antibodies with the desired specificity using one of the above-described procedures. 

The present invention further provides the above-described antibodies in detectably labelled form. Antibodies can
be detectably labelled through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels
(such as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.),
paramagnetic atoms, etc. Procedures for accomplishing such labelling are well known in the art (for example, see
Sternberger et al., J. Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308 (1979); Engval,
E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol. Meth. 13:215 (1976)).

The labeled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells 
or tissues in which a fragment of the Staphylococcus aureus genome is expressed. 

The present invention further provides the above-described antibodies immobilized on a solid support. Examples 
of such solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and sepharose, 
acrylic resins such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such solid supports
are well known in the art (Weir, D. M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific
Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth. Enzym. 34, Academic Press, N.Y. (1974)).
The immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for 
immunoaffinity purification of the proteins of the present invention. 



3. Diagnostic Assays and Kits



The present invention further provides methods to identify the expression of one of the ORFs of the present invention,
or homolog thereof, in a test sample, using one of the DFs, antigens or antibodies of the present invention.

In detail, such methods comprise incubating a test sample with one or more of the antibodies, or one or more of 
the DFs, or one or more antigens of the present invention and assaying for binding of the DFs, antigens or antibodies 
to components within the test sample. 

Conditions for incubating a DF, antigen or antibody with a test sample vary. Incubation conditions depend on the 
format employed in the assay, the detection methods employed, and the type and nature of the DF or antibody used 
in the assay. One skilled in the art will recognize that any one of the commonly available hybridization, amplification 
or immunological assay formats can readily be adapted to employ the DFs, antigens or antibodies of the present invention.
Examples of such assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al., Techniques in
Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science
Publishers, Amsterdam, The Netherlands (1985); and PCT publication WO95/32291, all of which are hereby incorporated
herein by reference.

The test samples of the present invention include cells, protein or membrane extracts of cells, or biological fluids 
such as sputum, blood, serum, plasma, or urine. The test sample used in the above-described method will vary based
on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed.
Methods for preparing protein extracts or membrane extracts of cells are well known in the art and can readily be
adapted in order to obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry
out the assays of the present invention.

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers
which comprises: (a) a first container comprising one of the DFs, antigens or antibodies of the present invention; and
(b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting
presence of a bound DF, antigen or antibody.

In detail, a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such
containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow one
to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are
not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test sample, a container which
contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody, antigen or DF.

Types of detection reagents include labelled nucleic acid probes, labelled secondary antibodies, or in the alternative,
if the primary antibody is labelled, the enzymatic, or antibody binding reagents which are capable of reacting with
the labelled antibody. One skilled in the art will readily recognize that the disclosed DFs, antigens and antibodies of the
present invention can be readily incorporated into one of the established kit formats which are well known in the art.



4. Screening Assay for Binding Agents 

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining
and identifying agents which bind to a protein encoded by one of the ORFs of the present invention or to one of the
fragments and the Staphylococcus aureus fragment and contigs herein described.
In general, such methods comprise steps of: 

(a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention, or an isolated
fragment of the Staphylococcus aureus genome; and

(b) determining whether the agent binds to said protein or said fragment. 



The agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives,
or other pharmaceutical agents. The agents can be selected and screened at random or rationally selected or designed
using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected 
at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention. 

Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be "rationally
selected or designed" when the agent is chosen based on the configuration of the particular protein. For example, one
skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides
(see, for example, Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," In Synthetic Peptides, A User's
Guide, W. H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical
agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly described, can be used to 
control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above, 
such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled 
artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or 
multiple ORFs which rely on the same EMF for expression control. 
One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix by
binding to DNA or RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or can
be [...] polymeric derivatives which have base attachment capacity. [...]
Raton, FL (1988)). Triple helix formation optimally results in a shut-off of RNA transcription from DNA, while antisense
RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated
to be effective in model systems. Information contained in the sequences of the present invention can be used to design
antisense and triple helix-forming oligonucleotides, and other DNA binding agents.
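
As a rough illustration of how sequence information might be used to design an antisense oligonucleotide, the following minimal Python sketch takes a window of a coding-strand sequence and returns its reverse complement, which is the DNA oligonucleotide expected to hybridize to the corresponding mRNA. The function names, the 20-base default length and the example sequence are assumptions made for illustration only; they are not taken from the disclosure, and a practical design would also weigh target accessibility, melting temperature and specificity.

```python
# Illustrative sketch only: names, default length and sequence are hypothetical.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, start: int, length: int = 20) -> str:
    """Return a DNA oligo complementary to the mRNA window that begins at
    `start` (0-based) on the coding strand and spans `length` bases."""
    target = coding_strand[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the sequence")
    return reverse_complement(target)


# Made-up sequence standing in for the 5' end of an ORF.
orf = "ATGAAAGGTCTGACCGAATTCGCTAAAGGTGGC"
oligo = antisense_oligo(orf, start=0, length=20)
print(oligo)
```

Targeting the window that contains the start codon is one common choice; any other window of the sequence can be selected by changing `start`.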

5. Pharmaceutical Compositions and Vaccines 

The present invention further provides pharmaceutical agents which can be used to modulate the growth or pathogenicity
of Staphylococcus aureus, or another related organism, in vivo or in vitro. As used herein, a "pharmaceutical
agent" is defined as a composition of matter which can be formulated using known techniques to provide a pharmaceutical
composition. As used herein, the "pharmaceutical agents of the present invention" refers to the pharmaceutical
agents which are derived from the proteins encoded by the ORFs of the present invention or are agents which are
identified using the herein described assays.

As used herein, a pharmaceutical agent is said to "modulate the growth or pathogenicity of Staphylococcus aureus
or a related organism, in vivo or in vitro," when the agent reduces the rate of growth, rate of division, or viability of the
organism in question. The pharmaceutical agents of the present invention can modulate the growth or pathogenicity
of an organism in many fashions, although an understanding of the underlying mechanism of action is not needed to
practice the use of the pharmaceutical agents of the present invention. Some agents will modulate the growth or
pathogenicity by binding to an important protein thus blocking the biological activity of the protein, while other agents may
bind to a component of the outer surface of the organism blocking attachment or rendering the organism more prone
to attack by the body's natural immune system. Alternatively, the agent may comprise a protein encoded by one of the ORFs
of the present invention and serve as a vaccine. The development and use of vaccines derived from membrane associated
polypeptides are well known in the art. The inventors have identified particularly preferred immunogenic
Staphylococcus aureus polypeptides for use as vaccines. Such immunogenic polypeptides are described above and
summarized in Table 4, below.

As used herein, a "related organism" is a broad term which refers to any organism whose growth or pathogenicity 
can be modulated by one of the pharmaceutical agents of the present invention. In general, such an organism will 
contain a homolog of the protein which is the target of the pharmaceutical agent or the protein used as a vaccine. As 
such, related organisms do not need to be bacterial but may be fungal or viral pathogens. 

The pharmaceutical agents and compositions of the present invention may be administered in a convenient manner,
such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal
routes. The pharmaceutical compositions are administered in an amount which is effective for treating and/or
prophylaxis of the specific indication. In general, they are administered in an amount of at least about 1 mg/kg body weight
and in most cases they will be administered in an amount not in excess of about 1 g/kg body weight per day. In most
cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body weight daily taking into account the routes of
administration, symptoms, etc.

The agents of the present invention can be used in native form or can be modified to form a chemical derivative. 
As used herein, a molecule is said to be a "chemical derivative" of another molecule when it contains additional chemical 
moieties not normally a part of the molecule. Such moieties may improve the molecule's solubility, absorption, biological 
half life, etc. The moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable 
side effect of the molecule, etc. Moieties capable of mediating such effects are disclosed in, among other sources, 
REMINGTON'S PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein. 

For example, such moieties may change an immunological character of the functional derivative, such as affinity
for a given antibody. Such changes in immunomodulation activity are measured by the appropriate assay, such as a
competitive type immunoassay. Modifications of such protein properties as redox or thermal stability, biological
half-life, hydrophobicity, susceptibility to proteolytic degradation or the tendency to aggregate with carriers or into multimers
also may be effected in this way and can be assayed by methods well known to the skilled artisan.

The therapeutic effects of the agents of the present invention may be obtained by providing the agent to a patient 
by any suitable means (e.g., inhalation, intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is 
preferred to administer the agent of the present invention so as to achieve an effective concentration within the blood 
or tissue in which the growth of the organism is to be controlled. To achieve an effective blood concentration, the 
preferred method is to administer the agent by injection. The administration may be by continuous infusion, or by single 
or multiple injections. 

In providing a patient with one of the agents of the present invention, the dosage of the administered agent will
vary depending upon such factors as the patient's age, weight, height, sex, general medical condition, previous medical
history, etc. In general, it is desirable to provide the recipient with a dosage of agent which is in the range of from about
1 pg/kg to 10 mg/kg (body weight of patient), although a lower or higher dosage may be administered. The
therapeutically effective dose can be lowered by using combinations of the agents of the present invention or another agent.

As used herein, two or more compounds or agents are said to be administered "in combination" with each other
when either (1) the physiological effects of each compound, or (2) the serum concentrations of each compound can
be measured at the same time. The composition of the present invention can be administered concurrently with, prior
to, or following the administration of the other agent.

The agents of the present invention are intended to be provided to recipient subjects in an amount sufficient to 
decrease the rate of growth (as defined above) of the target organism. 

The administration of the agent(s) of the invention may be for either a "prophylactic" or "therapeutic" purpose.

When provided prophylactically, the agent(s) are provided in advance of any symptoms indicative of the organism's
growth. The prophylactic administration of the agent(s) serves to prevent, attenuate, or decrease the rate of onset of
any subsequent infection. When provided therapeutically, the agent(s) are provided at (or shortly after) the onset of an
indication of infection. The therapeutic administration of the compound(s) serves to attenuate the pathological
symptoms of the infection and to increase the rate of recovery.

The agents of the present invention are administered to a subject, such as a mammal, or a patient, in a
pharmaceutically acceptable form and in a therapeutically effective concentration. A composition is said to be
"pharmacologically acceptable" if its administration can be tolerated by a recipient patient. Such an agent is said to be administered
in a "therapeutically effective amount" if the amount administered is physiologically significant. An agent is
physiologically significant if its presence results in a detectable change in the physiology of a recipient patient.

The agents of the present invention can be formulated according to known methods to prepare pharmaceutically
useful compositions, whereby these materials, or their functional derivatives, are combined in admixture with a
pharmaceutically acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of other human proteins,
e.g., human serum albumin, are described, for example, in REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed.,
Osol, A., Ed., Mack Publishing, Easton PA (1980). In order to form a pharmaceutically acceptable composition suitable
for effective administration, such compositions will contain an effective amount of one or more of the agents of the
present invention, together with a suitable amount of carrier vehicle.

Additional pharmaceutical methods may be employed to control the duration of action. Controlled release preparations
may be achieved through the use of polymers to complex or absorb one or more of the agents of the present invention.
The controlled delivery may be effectuated by a variety of well known techniques, including formulation with
macromolecules such as, for example, polyesters, polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate,
methylcellulose, carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the macromolecules and the agent
in the formulation, and by appropriate use of methods of incorporation, which can be manipulated to effectuate a desired
time course of release. Another possible method to control the duration of action by controlled release preparations is
to incorporate agents of the present invention into particles of a polymeric material such as polyesters, polyamino
acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of incorporating these
agents into polymeric particles, it is possible to entrap these materials in microcapsules prepared, for example, by
coacervation techniques or by interfacial polymerization with, for example, hydroxymethylcellulose or
gelatine-microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in colloidal drug delivery systems, for example,
liposomes, albumin microspheres, microemulsions, nanoparticles, and nanocapsules or in macroemulsions. Such
techniques are disclosed in REMINGTON'S PHARMACEUTICAL SCIENCES (1980).

The invention further provides a pharmaceutical pack or kit comprising one or more containers filled with one or
more of the ingredients of the pharmaceutical compositions of the invention. Associated with such container(s) can be
a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals
or biological products, which notice reflects approval by the agency of manufacture, use or sale for human
administration.

In addition, the agents of the present invention may be employed in conjunction with other therapeutic compounds.

6. Shot-Gun Approach to Megabase DNA Sequencing

The present invention further demonstrates that a large sequence can be sequenced using a random shotgun
approach. This procedure, described in detail in the examples that follow, has eliminated the up-front cost of isolating
and ordering overlapping or contiguous subclones prior to the start of the sequencing protocols.

Certain aspects of the present invention are described in greater detail in the examples that follow. The examples
are provided by way of illustration. Other aspects and embodiments of the present invention are contemplated by the
inventors, as will be clear to those of skill in the art from reading the present disclosure.

ILLUSTRATIVE EXAMPLES

LIBRARIES AND SEQUENCING

1. Shotgun Sequencing Probability Analysis

The overall strategy for a shotgun approach to whole genome sequencing follows from the Lander and Waterman
(Lander and Waterman, Genomics 2:231 (1988)) application of the equation for the Poisson distribution. According
to this treatment, the probability, P_0, that any given base in a sequence of size L, in nucleotides, is not sequenced after
a certain amount, n, in nucleotides, of random sequence has been determined can be calculated by the equation
P_0 = e^{-m}, where m is n/L, the fold coverage. For instance, for a genome of 2.8 Mb, m = 1 when 2.8 Mb of sequence has
been randomly generated (1X coverage). At that point, P_0 = e^{-1} = 0.37. The probability that any given base has not
been sequenced is the same as the probability that any region of the whole sequence L has not been determined and,
therefore, is equivalent to the fraction of the whole sequence that has yet to be determined. Thus, at one-fold coverage,
approximately 37% of a polynucleotide of size L, in nucleotides, has not been sequenced. When 14 Mb of sequence
has been generated, coverage is 5X for a 2.8 Mb genome and the unsequenced fraction drops to 0.0067 or 0.67%. 5X coverage
of a 2.8 Mb sequence can be attained by sequencing approximately 17,000 random clones from both insert ends with
an average sequence read length of 410 bp.

Similarly, the total gap length, G, is determined by the equation G = L e^{-m}, and the average gap size, g, follows the
equation g = L/N, where N is the number of random sequence reads (as distinct from the number of nucleotides
sequenced). Thus, 5X coverage leaves about 240 gaps averaging about 82 bp in size in a sequence of a
polynucleotide 2.8 Mb long.

The treatment above is essentially that of Lander and Waterman, Genomics 2: 231 (1988). 
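
The figures quoted above can be checked with a short calculation. The following Python sketch is illustrative only: the function name and the returned keys are hypothetical, the average gap size is taken as the genome length divided by the number of reads (the reading under which the 82 bp figure is reproduced), and the expected number of gaps is approximated as the total gap length divided by the average gap size.

```python
import math


def shotgun_coverage(genome_len: int, n_reads: int, read_len: float) -> dict:
    """Poisson estimates for a random shotgun project (hypothetical helper)."""
    nucleotides = n_reads * read_len        # total random sequence generated
    m = nucleotides / genome_len            # fold coverage
    p0 = math.exp(-m)                       # chance a given base is unsequenced
    return {
        "fold_coverage": m,
        "unsequenced_fraction": p0,
        "total_gap_bp": genome_len * p0,    # G = L * e^(-m)
        "mean_gap_bp": genome_len / n_reads,
        "expected_gaps": n_reads * p0,      # total gap / mean gap (approximation)
    }


# 17,000 clones read from both insert ends, 410 bp per read, 2.8 Mb genome.
stats = shotgun_coverage(2_800_000, 2 * 17_000, 410)
for key, value in stats.items():
    print(f"{key}: {value:.4g}")
```

With these inputs the coverage comes out just under 5X, the unsequenced fraction is a little under 0.7%, and the estimate gives roughly 230 to 240 gaps of about 82 bp each, in line with the values stated in the text.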

2. Random Library Construction 


In order to approximate the random model described above during actual sequencing, a nearly ideal library of 
cloned genomic fragments is required. The following library construction procedure was developed to achieve this end. 

Staphylococcus aureus DNA was prepared by phenol extraction. A mixture containing 600 ug DNA in 3.3 ml of
300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, 30% glycerol was sonicated for 1 min. at 0°C in a Branson
Model 450 Sonicator at the lowest energy setting using a 3 mm probe. The sonicated DNA was ethanol precipitated
and redissolved in 500 ul TE buffer.

To create blunt-ends, a 100 ul aliquot of the resuspended DNA was digested with 5 units of BAL31 nuclease (New
England BioLabs) for 10 min at 30°C in 200 ul BAL31 buffer. The digested DNA was phenol-extracted,
ethanol-precipitated, redissolved in 100 ul TE buffer, and then size-fractionated by electrophoresis through a 1.0% low melting
temperature (LGT) agarose gel. The section containing DNA fragments 1.6-2.0 kb in size was excised from the gel, and the
LGT agarose was melted and the resulting solution was extracted with phenol to separate the agarose from the DNA.
DNA was ethanol precipitated and redissolved in 20 ul of TE buffer for ligation to vector.

A two-step ligation procedure was used to produce a plasmid library with 97% inserts, of which >99% were single
inserts. The first ligation mixture (50 ul) contained 2 ug of DNA fragments, 2 ug pUC18 DNA (Pharmacia) cut with SmaI
and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase (GIBCO/BRL) and was incubated
at 14°C for 4 hr. The ligation mixture then was phenol extracted and ethanol precipitated, and the precipitated DNA
was dissolved in 20 ul TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete bands in a ladder
were visualized by ethidium bromide-staining and UV illumination and identified by size as insert (i), vector (v), v+i,
v+2i, v+3i, etc. The portion of the gel containing v+i DNA was excised and the v+i DNA was recovered and resuspended
into 20 ul TE. The v+i DNA then was blunt-ended by T4 polymerase treatment for 5 min. at 37°C in a reaction mixture
(50 ul) containing the v+i linears, 500 uM each of the 4 dNTPs, and 9 units of T4 polymerase (New England BioLabs),
under recommended buffer conditions. After phenol extraction and ethanol precipitation the repaired v+i linears were
dissolved in 20 ul TE. The final ligation to produce circles was carried out in a 50 ul reaction containing 5 ul of v+i
linears and 5 units of T4 ligase at 14°C overnight. After 10 min. at 70°C the following day the reaction mixture was
stored at -20°C.

This two-stage procedure resulted in a molecularly random collection of single-insert plasmid recombinants with 
minimal contamination from double-insert chimeras (<1%) or free vector (<3%). 
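
To make the size-based identification of the ladder bands concrete, the sketch below estimates the size ranges that the i, v, v+i, v+2i and v+3i species would be expected to occupy. It is only an illustration: the function name is hypothetical, the vector length uses the commonly published 2,686 bp figure for pUC18, the insert range is the 1.6-2.0 kb gel cut described above, and any trimming of fragment ends is ignored.

```python
# Illustrative sketch; 2686 bp is the commonly published length of pUC18.
PUC18_BP = 2686


def ladder_band_sizes(vector_bp, insert_range_bp, max_inserts=3):
    """Return {label: (min_bp, max_bp)} for insert, vector and v+k*i species."""
    lo, hi = insert_range_bp
    bands = {"i": (lo, hi), "v": (vector_bp, vector_bp)}
    for k in range(1, max_inserts + 1):
        label = "v+i" if k == 1 else f"v+{k}i"
        bands[label] = (vector_bp + k * lo, vector_bp + k * hi)
    return bands


bands = ladder_band_sizes(PUC18_BP, (1600, 2000))
for label, (lo, hi) in bands.items():
    print(f"{label:5s} {lo}-{hi} bp")
```

Under these assumptions the v+i band (about 4.3-4.7 kb) is separated from both free vector and v+2i by more than 1 kb, which is what makes excising single-insert molecules from the gel practical.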

Since deviation from randomness can arise from propagation of the DNA in the host, E. coli host cells deficient in all
recombination and restriction functions (A. Greener, Strategies 3 (1):5 (1990)) were used to prevent rearrangements,
deletions, and loss of clones by restriction. Furthermore, transformed cells were plated directly on antibiotic diffusion
plates to avoid the usual broth recovery phase which allows multiplication and selection of the most rapidly growing cells.

Plating was carried out as follows. A 100 ul aliquot of Epicurian Coli SURE II Supercompetent Cells (Stratagene
200152) was thawed on ice and transferred to a chilled Falcon 2059 tube on ice. A 1.7 ul aliquot of 1.42 M
beta-mercaptoethanol was added to the aliquot of cells to a final concentration of 25 mM. Cells were incubated on ice for
10 min. A 1 ul aliquot of the final ligation was added to the cells and incubated on ice for 30 min. The cells were heat
pulsed for 30 sec. at 42°C and placed back on ice for 2 min. The outgrowth period in liquid culture was eliminated
from this protocol in order to minimize the preferential growth of any given transformed cell. Instead the transformation
mixture was plated directly on a nutrient rich SOB plate containing a 5 ml bottom layer of SOB agar (5% SOB agar:
20 g tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5 ml bottom layer is supplemented
with 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml
X-Gal (2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4 per 100 ml SOB agar. The 15 ml top layer was poured just prior to plating.
Our titer was approximately 100 colonies/10 ul aliquot of transformation.
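
The stated final concentrations follow from a simple dilution calculation. The sketch below is a hedged illustration with a hypothetical helper name; it assumes the volumes are additive and uses only the stock concentrations and volumes given in the protocol above.

```python
def final_concentration(stock_conc, stock_vol, diluent_vol):
    """Concentration after adding stock_vol of stock to diluent_vol.

    Volumes must share one unit; the result is in the unit of stock_conc.
    Assumes volumes are additive (an approximation).
    """
    return stock_conc * stock_vol / (stock_vol + diluent_vol)


# 1.7 ul of 1.42 M beta-mercaptoethanol into 100 ul of cells, in mM.
bme_mM = 1000 * final_concentration(1.42, 1.7, 100.0)

# 0.4 ml of 50 mg/ml ampicillin into 100 ml of SOB agar, in ug/ml.
amp_ug_per_ml = 1000 * final_concentration(50.0, 0.4, 100.0)

print(f"beta-mercaptoethanol: {bme_mM:.1f} mM")
print(f"ampicillin: {amp_ug_per_ml:.0f} ug/ml")
```

The beta-mercaptoethanol value comes out slightly below the nominal 25 mM quoted in the protocol, and the ampicillin in the bottom layer works out to roughly 200 ug/ml.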

All colonies were picked for template preparation regardless of size. Thus, only clones lost due to "poison" DNA
or deleterious gene products would be deleted from the library, resulting in a slight increase in gap number over that
expected.

3. Random DNA Sequencing 

High quality double stranded DNA plasmid templates were prepared using an alkaline lysis method developed in
collaboration with 5 Prime -> 3 Prime, Inc. (Boulder, CO). Plasmid preparation was performed in a 96-well format for all
stages of DNA preparation from bacterial growth through final DNA purification. Average template concentration was
determined by running 25% of the samples on an agarose gel. DNA concentrations were not adjusted.

Templates were also prepared from a Staphylococcus aureus lambda genomic library. An unamplified library was
constructed in Lambda DASH II vector (Stratagene). Staphylococcus aureus DNA (> 100 kb) was partially digested in
a reaction mixture (200 ul) containing 50 ug DNA, 1X Sau3AI buffer, 20 units Sau3AI for 6 min. at 23°C. The digested
DNA was phenol-extracted and centrifuged over a 10-40% sucrose gradient. Fractions containing genomic DNA of
15-25 kb were recovered by precipitation. One ul of fragments was used with 1 ul of DASH II vector (Stratagene) in
the recommended ligation reaction. One ul of the ligation mixture was used per packaging reaction following the
recommended protocol with the Gigapack II XL Packaging Extract. Phage were plated directly without amplification from
the packaging mixture (after dilution with 500 ul of recommended SM buffer and chloroform treatment). Yield was about
2.5x10^9 pfu/ul.

An amplified library was prepared from the primary packaging mixture according to the manufacturer's protocol.
The amplified library is stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1x10^9 pfu/ml.

Mini-liquid lysates (0.1 ul) are prepared from randomly selected plaques and template is prepared by long range
PCR. Samples are PCR amplified using modified T3 and T7 primers, and Elongase Supermix (LTI).

Sequencing reactions are carried out on plasmid templates using a combination of two workstations (BIOMEK
1000 and Hamilton Microlab 2200) and the Perkin-Elmer 9600 thermocycler with Applied Biosystems PRISM Ready
Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and the M13 reverse (M13RP1) primers.
Dye terminator sequencing reactions are carried out on the lambda templates on a Perkin-Elmer 9600 Thermocycler
using the Applied Biosystems Ready Reaction Dye Terminator Cycle Sequencing kits. Modified T7 and T3 primers are
used to sequence the ends of the inserts from the Lambda DASH II library. Sequencing reactions are run on a combination
of ABI 373 DNA Sequencers and ABI 377 DNA Sequencers. All of the dye terminator sequencing reactions are analyzed
using the 2X 9 hour module on the ABI 377. Dye primer reactions are analyzed on a combination of ABI 373 and ABI
377 DNA Sequencers. The overall sequencing success rate is approximately 85% for M13-21 and M13RP1
sequences and 65% for dye-terminator reactions. The average usable read length is 485 bp for M13-21 sequences,
445 bp for M13RP1 sequences, and 375 bp for dye-terminator reactions.

4. Protocol for Automated Cycle Sequencing

The sequencing was carried out using Hamilton Microstation 2200, Perkin Elmer 9600 thermocyclers, ABI 373
and ABI 377 Automated DNA Sequencers. The Hamilton combines pre-aliquoted templates and reaction mixes
consisting of deoxy- and dideoxynucleotides, the thermostable Taq DNA polymerase, fluorescently-labelled sequencing
primers, and reaction buffer. Reaction mixes and templates were combined in the wells of a 96-well thermocycling
plate and transferred to the Perkin Elmer 9600 thermocycler. Thirty consecutive cycles of linear amplification (i.e., one
primer synthesis) steps were performed including denaturation, annealing of primer and template, and extension [...]

[...] shotgun sequencing involves use of four dye-labelled sequencing primers, one for each of the four
terminator nucleotides. Each dye-primer was labelled with a different fluorescent dye, permitting the four individual
reactions to be combined into one lane of the 373 or 377 DNA Sequencer for electrophoresis, detection, and
base-calling. ABI currently supplies premixed reaction mixes in bulk packages containing all the necessary non-template
reagents for sequencing. Sequencing can be done with both plasmid and PCR-generated templates with both
dye-primers and dye-terminators with approximately equal fidelity, although plasmid templates generally give longer usable
sequences.

Thirty-two reactions were loaded per ABI 373 Sequencer each day and 96 samples can be loaded on an ABI 377 
per day. Electrophoresis was run overnight (ABI 373) or for 2 1/2 hours (ABI 377) following the manufacturer's protocols. 
Following electrophoresis and fluorescence detection, the ABI 373 or ABI 377 performs automatic lane tracking and 
base-calling. The lane-tracking was confirmed visually. Each sequence electropherogram (or fluorescence lane trace) 
was inspected visually and assessed for quality. Trailing sequences of low quality were removed and the sequence 
itself was loaded via software to a Sybase database (archived daily to 8mm tape). Leading vector polylinker sequence 
was removed automatically by a software program. Average edited lengths of sequences from the standard ABI 373 
or ABI 377 were around 400 bp and depend mostly on the quality of the template used for the sequencing reaction. 

INFORMATICS 

1. Data Management 

A number of information management systems for a large-scale sequencing lab have been developed. (For review
see, for instance, Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System
Sciences, IEEE Computer Society Press, Washington, D.C., 585 (1993).) The system used to collect and assemble
the sequence data was developed using the Sybase relational database management system and was designed to
automate data flow wherever possible and to reduce user error. The database stores and correlates all information
collected during the entire operation from template preparation to final analysis of the genome. Because the raw output
of the ABI 373 Sequencers was based on a Macintosh platform and the data management system chosen was based
on a Unix platform, it was necessary to design and implement a variety of multi-user, client-server applications which
allow the raw data as well as analysis results to flow seamlessly into the database with a minimum of user effort.

2. Assembly 

An assembly engine (TIGR Assembler) developed for the rapid and accurate assembly of thousands of sequence
fragments was employed to generate contigs. The TIGR Assembler simultaneously clusters and assembles fragments
of the genome. In order to obtain the speed necessary to assemble more than 10^4 fragments, the algorithm builds a
hash table of 12 bp oligonucleotide subsequences to generate a list of potential sequence fragment overlaps. The
number of potential overlaps for each fragment determines which fragments are likely to fall into repetitive elements.
Beginning with a single seed sequence fragment, TIGR Assembler extends the current contig by attempting to add
the best matching fragment based on oligonucleotide content. The contig and candidate fragment are aligned using a
modified version of the Smith-Waterman algorithm which provides for optimal gapped alignments (Waterman, M. S.,
Methods in Enzymology 164:765 (1988)). The contig is extended by the fragment only if strict criteria for the quality
of the match are met. The match criteria include the minimum length of overlap, the maximum length of an unmatched
end, and the minimum percentage match. These criteria are automatically lowered by the algorithm in regions of minimal
coverage and raised in regions with a possible repetitive element. Fragments representing the boundaries of
repetitive elements and potentially chimeric fragments are often rejected based on partial mismatches at the ends of
alignments and excluded from the current contig. TIGR Assembler is designed to take advantage of clone size information
coupled with sequencing from both ends of each template. It enforces the constraint that sequence fragments from
two ends of the same template point toward one another in the contig and are located within a certain range of base
pairs (definable for each clone based on the known clone size range for a given library).
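
The hash-table step described above can be sketched in a few lines. The following Python example is not the TIGR Assembler; it is a minimal illustration, with hypothetical function names and a made-up 60-base sequence, of how indexing every 12-base subsequence yields a list of candidate fragment overlaps and a per-fragment overlap count. It considers the forward strand only and does not perform the subsequent gapped alignment.

```python
from collections import defaultdict
from itertools import combinations


def kmer_index(fragments, k=12):
    """Map each k-base subsequence to the set of fragment ids containing it."""
    index = defaultdict(set)
    for frag_id, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(frag_id)
    return index


def candidate_overlaps(fragments, k=12):
    """Return {(id_a, id_b): number of distinct shared k-mers}."""
    shared = defaultdict(int)
    for ids in kmer_index(fragments, k).values():
        for pair in combinations(sorted(ids), 2):
            shared[pair] += 1
    return dict(shared)


def overlap_counts(pairs):
    """Number of candidate partners per fragment; unusually high counts
    would flag fragments that may lie in repetitive elements."""
    counts = defaultdict(int)
    for a, b in pairs:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)


# Made-up 60-base "genome" cut into three fragments with 15-base overlaps.
genome = "ATGGCTAGCTTACGGATCCAAGTCTGAACCGTTAGGCATCGATTGCACCTGAGTTCAAGG"
fragments = {"f1": genome[0:30], "f2": genome[15:45], "f3": genome[30:60]}
pairs = candidate_overlaps(fragments)
print(pairs)
print(overlap_counts(pairs))
```

In a fuller implementation each candidate pair would then be passed to an alignment routine, and only pairs meeting the overlap-length, unmatched-end and percent-identity criteria would be merged into the contig.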

3. Identifying Genes 

The predicted coding regions of the Staphylococcus aureus genome were initially defined with the program zorf,
which finds ORFs of a minimum length. The predicted coding region sequences were used in searches against a
database of all Staphylococcus aureus nucleotide sequences from GenBank (release 92.0), using the BLASTN search
method to identify overlaps of 50 or more nucleotides with at least a 95% identity. Those ORFs with nucleotide sequence
matches are shown in Table 1. The ORFs without such matches were translated to protein sequences and
compared to a non-redundant database of known proteins generated by combining the Swiss-Prot, PIR and GenPept
databases. ORFs of at least 80 amino acids that matched a database protein with BLASTP probability less than or
equal to 0.01 are shown in Table 2. The table also lists assigned functions based on the closest match in the databases.
ORFs of at least 120 amino acids that did not match protein or nucleotide sequences in the databases at these levels
are shown in Table 3.
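
For readers unfamiliar with minimum-length ORF finding, the sketch below shows one simple way such a scan can be done. It is not the zorf program, whose internals are not described here; the function names are hypothetical, and the definition used (an ATG followed in frame by a stop codon, scanned in three frames on each strand) is an assumption made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def orfs_in_strand(seq, min_codons):
    """Return (start, end) pairs, 0-based with exclusive end (stop included),
    for ATG...stop ORFs with at least min_codons codons before the stop."""
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    found.append((start, i + 3))
                start = None
    return found


def find_orfs(seq, min_codons=80):
    """Scan both strands; '-' coordinates refer to the reverse complement."""
    seq = seq.upper()
    rev = seq.translate(COMPLEMENT)[::-1]
    plus = [("+", s, e) for s, e in orfs_in_strand(seq, min_codons)]
    minus = [("-", s, e) for s, e in orfs_in_strand(rev, min_codons)]
    return plus + minus


# Tiny made-up example: one ORF of 6 codons plus a stop codon.
demo = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
print(find_orfs(demo, min_codons=5))
```

The default of 80 codons mirrors the 80 amino acid threshold mentioned above; raising `min_codons` to 120 would correspond to the stricter cut-off applied to ORFs without database matches.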

ILLUSTRATIVE APPLICATIONS 

1. Production of an Antibody to a Staphylococcus aureus Protein 

Substantially pure protein or polypeptide is isolated from the transfected or transformed cells using any one of the
methods known in the art. The protein can also be produced in a recombinant prokaryotic expression system, such as
E. coli, or can be chemically synthesized. Concentration of protein in the final preparation is adjusted, for example, by
concentration on an Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the
protein can then be prepared as follows.

2. Monoclonal Antibody Production by Hybridoma Fusion 

Monoclonal antibody to epitopes of any of the peptides identified and isolated as described can be prepared from 
murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or modifications 
of the methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein 
over a period of a few weeks. The mouse is then sacrificed, and the antibody-producing cells of the spleen isolated. 
The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells 
destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused 
cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued. 
Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay 
procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and modified methods 
thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. 
Detailed procedures for monoclonal antibody production are described in Davis, L. et al., Basic Methods in Molecular 
Biology, Elsevier, New York, Section 21-2 (1989). 

3. Polyclonal Antibody Production by Immunization 

Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by immunizing 
suitable animals with the expressed protein described above, which can be unmodified or modified to enhance 
immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and 
the host species. For example, small molecules tend to be less immunogenic than others and may require the use of 
carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with both inadequate and 
excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple 
intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis, 
J. et al., J. Clin. Endocrinol. Metab. 33:988-991 (1971). 

Booster injections can be given at regular intervals, and antiserum harvested when the antibody titer thereof, as determined 
semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the 
antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology, 
Wier, D., ed., Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum 
(about 12 µM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, 
for example, by Fisher, D., Chap. 42 in: Manual of Clinical Immunology, second edition, Rose and Friedman, eds., Amer. 
Soc. for Microbiology, Washington, D.C. (1980). 

Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which determine 
concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively 
or qualitatively to identify the presence of antigen in a biological sample. In addition, they are useful in various animal 
models of Staphylococcal disease known to those of skill in the art as a means of evaluating the protein used to make 
the antibody as a potential vaccine target or as a means of evaluating the antibody as a potential immunotherapeutic 
reagent. 



[...] in accordance with the present invention to prepare PCR primers for a variety of uses. The PCR primers 
are preferably at least 15 bases, and more preferably at least 16 bases in length. When selecting a primer sequence, 
it is preferred that the primer pairs have approximately the same G/C ratio so that melting temperatures are approximately 
the same. The PCR primers and amplified DNA of this Example find use in the Examples that follow. 

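
The primer selection guidance above (a minimum length and a similar G/C ratio within a pair) can be checked with a small helper. The sketch below is an assumption-laden illustration rather than the procedure used in this document: it estimates melting temperature with the Wallace rule (2 °C per A or T, 4 °C per G or C), which is only a rough guide for short oligonucleotides, and the function names and the 2 °C tolerance are hypothetical choices.

```python
def gc_fraction(primer):
    """Fraction of bases in the primer that are G or C."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer):
    """Approximate melting temperature in degrees C using the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def pair_is_balanced(forward, reverse, min_len=15, max_tm_diff=2.0):
    """True if both primers meet the minimum length and have similar estimated Tm.

    min_len follows the 15-base minimum stated in the text; max_tm_diff is an
    illustrative tolerance, not a value taken from the document.
    """
    if len(forward) < min_len or len(reverse) < min_len:
        return False
    return abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_tm_diff
```

Because the Wallace estimate depends only on base composition, two primers of equal length with the same G/C ratio receive the same estimated melting temperature, which is the property the text asks for.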
4. Gene expression from DNA Sequences Corresponding to ORFs 

A fragment of the Staphylococcus aureus genome provided in Tables 1-3 is introduced into an expression vector 
using conventional technology. Techniques to transfer cloned sequences into expression vectors that direct protein 
translation in mammalian, yeast, insect or bacterial expression systems are well known in the art. Vectors and 
expression systems are commercially available from a variety of suppliers including Stratagene (La Jolla, California), 
Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If desired, to enhance expression and facilitate 
proper protein folding, the codon context and codon pairing of the sequence may be optimized for the particular expression 
organism, as explained by Hatfield et al., U.S. Patent No. 5,082,767, incorporated herein by this reference. 

The following is provided as one exemplary method to generate polypeptide(s) from cloned ORFs of the Staphylococcus 
aureus genome fragment. Bacterial ORFs generally lack a poly A addition signal. The addition signal sequence 
can be added to the construct by, for example, splicing out the poly A addition sequence from pSG5 (Stratagene) using 
BglI and SalI restriction endonuclease enzymes and incorporating it into the mammalian expression vector pXT1 (Stratagene) 
for use in eukaryotic expression systems. pXT1 contains the LTRs and a portion of the gag gene of Moloney 
Murine Leukemia Virus. The positions of the LTRs in the construct allow efficient stable transfection. The vector includes 
the Herpes Simplex thymidine kinase promoter and the selectable neomycin gene. The Staphylococcus aureus DNA 
is obtained by PCR from the bacterial vector using oligonucleotide primers complementary to the Staphylococcus 
aureus DNA and containing restriction endonuclease sequences for PstI incorporated into the 5' primer and BglII at 
the 5' end of the corresponding Staphylococcus aureus DNA 3' primer, taking care to ensure that the Staphylococcus 
aureus DNA is positioned such that it is followed by the poly A addition sequence. The purified fragment obtained from 
the resulting PCR reaction is digested with PstI, blunt ended with an exonuclease, digested with BglII, purified and 
ligated to pXT1, now containing a poly A addition sequence and digested with BglII. 
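
The primer construction described above, with a PstI site on the 5' primer and a BglII site at the 5' end of the 3' primer, can be sketched as follows. The recognition sequences (CTGCAG for PstI, AGATCT for BglII) are standard; everything else, including the function name `design_cloning_primers`, the 18-base annealing length and the example ORF, is a hypothetical illustration and not a sequence or procedure taken from this document.

```python
PSTI_SITE = "CTGCAG"   # PstI recognition sequence
BGLII_SITE = "AGATCT"  # BglII recognition sequence
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_cloning_primers(orf, anneal_len=18):
    """Return (five_prime_primer, three_prime_primer) for amplifying an ORF.

    The 5' primer is the PstI site followed by the first anneal_len bases of
    the ORF. The 3' primer is the BglII site followed by the reverse
    complement of the last anneal_len bases, so that the BglII site ends up
    downstream of the coding sequence in the amplified fragment.
    """
    orf = orf.upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the requested annealing length")
    five_prime = PSTI_SITE + orf[:anneal_len]
    three_prime = BGLII_SITE + reverse_complement(orf[-anneal_len:])
    return five_prime, three_prime
```

In practice a few extra bases are often added 5' of each site to help the enzyme cut near the fragment end; that refinement is omitted here to keep the sketch minimal.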

The ligated product is transfected into mouse NIH 3T3 cells using Lipofectin (Life Technologies, Inc., Grand Island, 
New York) under conditions outlined in the product specification. Positive transfectants are selected after growing the 
transfected cells in 600 µg/ml G418 (Sigma, St. Louis, Missouri). The protein is preferably released into the supernatant. 
However, if the protein has membrane binding domains, the protein may additionally be retained within the cell or 
expression may be restricted to the cell surface. Since it may be necessary to purify and locate the transfected product, 
synthetic 15-mer peptides synthesized from the predicted Staphylococcus aureus DNA sequence are injected into 
mice to generate antibody to the polypeptide encoded by the Staphylococcus aureus DNA. 

Alternatively, if antibody production is not possible, the Staphylococcus aureus DNA sequence is additionally 
incorporated into eukaryotic expression vectors and expressed as, for example, a globin fusion. Antibody to the globin 
moiety then is used to purify the chimeric protein. Corresponding protease cleavage sites are engineered between the 
globin moiety and the polypeptide encoded by the Staphylococcus aureus DNA so that the latter may be freed from 
the former by simple protease digestion. One useful expression vector for generating globin chimerics is pSG5 (Stratagene). 
This vector encodes a rabbit globin. Intron II of the rabbit globin gene facilitates splicing of the expressed 
transcript, and the polyadenylation signal incorporated into the construct increases the level of expression. These 
techniques are well known to those skilled in the art of molecular biology. Standard methods are published in methods 
texts such as Davis et al., cited elsewhere herein, and many of the methods are available from the technical assistance 
representatives from Stratagene, Life Technologies, Inc., or Promega. Polypeptides of the invention also may be produced 
using in vitro translation systems such as the In Vitro Express™ Translation Kit (Stratagene). 

While the present invention has been described in some detail for purposes of clarity and understanding, one 
skilled in the art will appreciate that various changes in form and detail can be made without departing from the true 
scope of the invention. 

All patents, patent applications and publications referred to above are hereby incorporated by reference. 

Table 4 

Antigenic Regions 

ORF | SEQ ID NO | BLAST HOMOLOG | Region 1 | Region 2 | Region 3 | Region 4 
168_6 | 5192 | lipoprotein | 36-45 | 84-103 | | 176-185 
238_1 | 5193 | chrA | 21-39 | 48-58 | | 232-249 
51_2 | 5194 | OppB gene product (B. sub | 20-36 | 70-79 | [illegible] | 121-131 
278_3 | 5195 | lipoprotein | 20-29 | 59-73 | | 162-171 
276_2 | 5196 | lipoprotein | 21-33 | 65-74 | [illegible] | 211-220 
45_4 | 5197 | ProX | 28-37 | 59-69 | [illegible] | 
316_8 | 5198 | hypothetical protein | 45-54 | 88-97 | [illegible] | 
154_15 | 5199 | unknown | 31-40 | 48-58 | | [illegible] 
228_3 | 5200 | unknown | 25-38 | 40-52 | [illegible] | 
228_6 | 5201 | unknown | 29-41 | 89-101 | [illegible] | [illegible] 
50_1 | 5202 | unknown | 21-33 | 52-61 | [illegible] | [illegible] 
112_7 | 5203 | iron-binding periplasmic | 21-31 | 58-67 | [illegible] | [illegible] 
442_1 | 5204 | unknown | 30-39 | 91-100 | [illegible] | [illegible] 
66_2 | 5205 | unknown | 50-59 | 104-116 | | [illegible] 
304_2 | 5206 | [illegible]-binding periplasmic | 19-28 | 48-57 | [illegible] | [illegible] 
44_1 | 5207 | hypothetical protein | 27-36 | 86-95 | [illegible] | [illegible] 
161_4 | 5208 | SphX | 27-44 | 149-161 | [illegible] | 
46_5 | 5209 | cmpC (permease) | 21-33 | 61-70 | [illegible] | [illegible] 
942_1 | 5210 | traH [Plasmid pSK41] | 83-92 | 109-118 | | 
5_4 | 5211 | ORF (S. aureus) | 12-22 | 87-96 | [illegible] | [illegible] 
20_4 | 5212 | peptidoglycan hydrolase (S | 24-34 | 129-138 | [illegible] | [illegible] 
328_2 | 5213 | lipoprotein (H. flu) | 81-90 | 123-133 | [illegible] | 
520_2 | 5214 | fibronectin binding protein | 44-54 | 63-79 | [illegible] | 
771_1 | 5215 | [illegible] gene product (S. py | 30-39 | 65-82 | | [illegible] 
999_1 | 5216 | predicted trithorax prot. (D | 7-16 | 120-129 | [illegible] | 
853_1 | 5217 | ORF2136 (Marchantia polyr | 43-52 | 88-97 | | 
287_1 | 5218 | psaA homolog | 13-22 | 28-44 | [illegible] | 114-124 
288_2 | 5219 | cell wall enzyme | 14-23 | 89-98 | | 
596_2 | 5220 | penicillin binding protein 2b | 40-49 | 59-68 | | 106-115 
217_5 | 5221 | fibronectin/fibrinogen bindi | 28-37 | 40-49 | 62-71 | 93-111 
217_6 | 5222 | fibronectin/fibrinogen bp | 10-19 | 31-40 | [illegible] | 73-92 
528_3 | 5223 | myosin cross reactive prot | 4-13 | 29-47 | [illegible] | 
171_11 | 5224 | EF | 20-31 | 91-110 | | 
63_4 | 5225 | penicillin binding protein 2b | 12-21 | 59-68 | [illegible] | 
353_2 | 5226 | | 46-55 | 62-71 | | 
743_1 | 5227 | 29 kDa protein in fimA regi | 23-32 | 68-79 | | [illegible] 
342_4 | 5228 | Twitching motility | 10-19 | 48-60 | [illegible] | [illegible] 
69_3 | 5229 | arabinogalactan protein | 97-106 | 132-141 | 158-167 | 180-189 
70_6 | 5230 | nodulin | [illegible] | [illegible] | 137-160 | 179-188 
129_2 | 5231 | glycerol diester phosphodie | 8-17 | 41-50 | 55-74 | 97-106 
58_5 | 5232 | PBP (S. aureus) | 26-35 | 70-79 | 117-126 | 152-161 
188_3 | 5233 | MHC class II analog (S. aure | 72-81 | 94-103 | 115-124 | 136-145 
236_6 | 5234 | histidine kinase domain (Die | 24-33 | 52-67 | 81-94 | 106-121 
310_8 | 5235 | clumping factor (S. aureus) | 59-71 | 77-86 | 93-102 | 118-127 
601_1 | 5236 | novel antigen/ORF2 (S. au | 45-54 | 91-104 | 108-117 | 186-195 
544_3 | 5237 | ORF YJR151c (S. cerevisae | 76-90 | 101-111 | 131-140 | 154-164 
662_1 | 5238 | MHC class II analog (S. aure | 22-32 | 71-80 | 89-98 | 114-122 
87_7 | 5239 | 5' nucleotidase precursor ( | 29-45 | 62-71 | 105-114 | 125-137 
120_1 | 5240 | [illegible] gene product (B. sub | 102-111 | | | 

Table 4 

Antigenic Regions (cont) 

ORF | Region 5 | Region 6 | Region 7 | Region 8 | Region 9 | Region 10 
168_6 | 244-272 | 303-315 | | | | 
238_1 | 260-269 | 291-301 | 308-317 | | | 
51_2 | 140-152 | 188-208 | 211-220 | 256-266 | 273-283 | 
278_3 | 198-209 | | | | | 
276_2 | 255-268 | | | | | 
45_4 | 177-199 | 221-230 | 234-243 | 268-279 | 284-293 | 304-313 
316_8 | | | | | | 
154_15 | 148-157 | 177-187 | 202-211 | | | 
228_3 | 101-119 | 139-154 | 166-181 | | | 
228_6 | | | | | | 
50_1 | | | | | | 
112_7 | 136-149 | 197-211 | 218-229 | 253-273 | | 
442_1 | 199-210 | 247-257 | 264-277 | 287-309 | | 
66_2 | | | | | | 
304_2 | 178-187 | 250-259 | | | | 
44_1 | | | | | | 
161_4 | | | | | | 
46_5 | 131-141 | 162-176 | 206-215 | 243-252 | 264-273 | 285-294 
942_1 | | | | | | 
5_4 | 189-205 | 230-239 | 246-264 | 301-318 | 340-354 | 378-387 
20_4 | 202-212 | 217-234 | 260-275 | 314-336 | 366-373 | 380-391 
328_2 | | | | | | 
520_2 | | | | | | 
771_1 | 145-154 | | | | | 
999_1 | | | | | | 
853_1 | | | | | | 
287_1 | 154-164 | | | | | 
288_2 | | | | | | 
596_2 | 121-130 | | | | | 
217_5 | 244-253 | 259-268 | 288-297 | 302-311 | | 
217_6 | 144-158 | 174-183 | 188-197 | 207-216 | 226-242 | 
528_3 | | | | | | 
171_11 | | | | | | 
63_4 | | | | | | 
353_2 | | | | | | 
743_1 | 197-207 | | | | | 
342_4 | | | | | | 
69_3 | 195-211 | | | | | 
70_6 | 206-215 | 263-272 | 291-301 | 331-340 | 358-371 | 390-414 
129_2 | 117-127 | 141-157 | 168-183 | 202-211 | 222-231 | 261-270 
58_5 | 184-203 | 260-269 | 275-299 | 330-344 | 372-381 | 424-433 
188_3 | | | | | | 
236_6 | 138-147 | 163-172 | 187-198 | 244-261 | 268-278 | 308-317 
310_8 | 131-140 | 144-153 | 177-186 | 190-199 | 204-213 | 216-227 
601_1 | 208-218 | | | | | 
544_3 | 170-179 | 184-193 | 224-235 | 274-287 | 327-336 | 352-361 
662_1 | | | | | | 
87_7 | | | | | | 
120_1 | | | | | | 

Table 4 

Antigenic Regions (cont) 

ORF | Region 11 | Region 12 | Region 13 | Region 14 | Region 15 | Region 16 
168_6 | | | | | | 
238_1 | | | | | | 
51_2 | | | | | | 
278_3 | | | | | | 
276_2 | | | | | | 
45_4 | | | | | | 
316_8 | | | | | | 
154_15 | | | | | | 
228_3 | | | | | | 
228_6 | | | | | | 
50_1 | | | | | | 
112_7 | | | | | | 
442_1 | | | | | | 
66_2 | | | | | | 
304_2 | | | | | | 
44_1 | | | | | | 
161_4 | | | | | | 
46_5 | 306-315 | | | | | 
942_1 | | | | | | 
5_4 | 393-407 | 416-426 | 456-465 | | | 
20_4 | 396-405 | 410-419 | 461-481 | | | 
328_2 | | | | | | 
520_2 | | | | | | 
771_1 | | | | | | 
999_1 | | | | | | 
853_1 | | | | | | 
287_1 | | | | | | 
288_2 | | | | | | 
596_2 | | | | | | 
217_5 | | | | | | 
217_6 | | | | | | 
528_3 | | | | | | 
171_11 | | | | | | 
63_4 | | | | | | 
353_2 | | | | | | 
743_1 | | | | | | 
342_4 | | | | | | 
69_3 | | | | | | 
70_6 | 453-471 | 506-515 | | | | 
129_2 | 296-315 | | | | | 
58_5 | | | | | | 
188_3 | | | | | | 
236_6 | 358-377 | 410-423 | 428-439 | 442-457 | 467-476 | 480-493 
310_8 | 238-251 | 256-275 | 281-290 | 296-310 | 314-333 | 338-347 
601_1 | | | | | | 

Table 4 

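Antigenic Regions (cont) 

ORF | Region 17 | Region 18 | Region 19 | Region 20 | Region 21 | Region 22 
168_6 | | | | | | 
238_1 | | | | | | 
51_2 | | | | | | 
278_3 | | | | | | 
276_2 | | | | | | 
45_4 | | | | | | 
316_8 | | | | | | 
154_15 | | | | | | 
228_3 | | | | | | 
228_6 | | | | | | 
50_1 | | | | | | 
112_7 | | | | | | 
442_1 | | | | | | 
66_2 | | | | | | 
304_2 | | | | | | 
44_1 | | | | | | 
161_4 | | | | | | 
46_5 | | | | | | 
942_1 | | | | | | 
5_4 | | | | | | 
20_4 | | | | | | 
328_2 | | | | | | 
520_2 | | | | | | 
771_1 | | | | | | 
999_1 | | | | | | 
853_1 | | | | | | 
287_1 | | | | | | 
288_2 | | | | | | 
596_2 | | | | | | 
217_5 | | | | | | 
217_6 | | | | | | 
528_3 | | | | | | 
171_11 | | | | | | 
63_4 | | | | | | 
353_2 | | | | | | 
743_1 | | | | | | 
342_4 | | | | | | 
69_3 | | | | | | 
70_6 | | | | | | 
129_2 | | | | | | 
58_5 | | | | | | 
188_3 | | | | | | 
236_6 | | | | | | 
310_8 | 357-366 | 370-379 | 429-438 | 443-452 | 478-487 | 551-560 
601_1 | | | | | | 
544_3 | | | | | | 
662_1 | | | | | | 
87_7 | | | | | | 
120_1 | | | | | | 

Table 4 
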
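Antigenic Regions (cont) 

ORF | Region 23 | Region 24 | Region 25 | Region 26 | Region 27 | Region 28 
168_6 | | | | | | 
238_1 | | | | | | 
51_2 | | | | | | 
278_3 | | | | | | 
276_2 | | | | | | 
45_4 | | | | | | 
316_8 | | | | | | 
154_15 | | | | | | 
228_3 | | | | | | 
228_6 | | | | | | 
50_1 | | | | | | 
112_7 | | | | | | 
442_1 | | | | | | 
66_2 | | | | | | 
304_2 | | | | | | 
44_1 | | | | | | 
161_4 | | | | | | 
46_5 | | | | | | 
942_1 | | | | | | 
5_4 | | | | | | 
20_4 | | | | | | 
328_2 | | | | | | 
520_2 | | | | | | 
771_1 | | | | | | 
999_1 | | | | | | 
853_1 | | | | | | 
287_1 | | | | | | 
288_2 | | | | | | 
596_2 | | | | | | 
217_5 | | | | | | 
217_6 | | | | | | 
528_3 | | | | | | 
171_11 | | | | | | 
63_4 | | | | | | 
353_2 | | | | | | 
743_1 | | | | | | 
342_4 | | | | | | 
69_3 | | | | | | 
70_6 | | | | | | 
129_2 | | | | | | 
58_5 | | | | | | 
188_3 | | | | | | 
236_6 | | | | | | 
310_8 | 622-632 | 670-685 | 708-718 | 823-836 | 858-867 | 877-886 
601_1 | | | | | | 
544_3 | | | | | | 

Table 4 
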
Antigenic Regions (cont) 

ORF | Region 29 | Region 30 
168_6 | | 
238_1 | | 
51_2 | | 
278_3 | | 
276_2 | | 
45_4 | | 
316_8 | | 
154_15 | | 
228_3 | | 
228_6 | | 
112_7 | | 
442_1 | | 
66_2 | | 
304_2 | | 
44_1 | | 
161_4 | | 
46_5 | | 
942_1 | | 
5_4 | | 
20_4 | | 
328_2 | | 
520_2 | | 
771_1 | | 
999_1 | | 
853_1 | | 
287_1 | | 
288_2 | | 
596_2 | | 
217_5 | | 
217_6 | | 
528_3 | | 
171_11 | | 
63_4 | | 
353_2 | | 
743_1 | | 
342_4 | | 
69_3 | | 
70_6 | | 
129_2 | | 
58_5 | | 
188_3 | | 
236_6 | | 
310_8 | | 
601_1 | | 
544_3 | | 
662_1 | | 
120_1 | | 

Table 4 

ORF | SEQ ID NO | BLAST HOMOLOG | Region 1 | Region 2 | Region 3 | Region 4 
46_1 | 5241 | aldehyde dehydrogenase | 8-17 | [illegible] | [illegible] | 112-121 
63_4 | 5242 | glycerol ester hydrolase (P. | [illegible] | 57-73 | 93-107 | 123-133 
174_6 | 5243 | ketopantoate hydroxymeth | 71-80 | 203-212 | [illegible] | [illegible] 
206_16 | 5244 | ornithine acetyltransferase | [illegible] | 34-43 | 54-63 | 194-210 
267_1 | 5245 | NaH-antiporter protein (E. r | 120-129 | 332-347 | | 
322_1 | 5246 | acriflavin resistance protein | 58-75 | 153-164 | 203-231 | 264-284 
415_2 | 5247 | transport ATP-binding prot | 108-126 | 218-227 | 298-308 | 315-334 
214_3 | 5248 | 2-nitropropane dioxygenas | 123-136 | 216-233 | | [illegible] 
587_3 | 5249 | clumping factor | 5-14 | 43-54 | 59-68 | 76-95 
685_1 | 5250 | signal peptidase | 59-68 | 72-81 | 86-95 | 99-108 
54_3 | 5251 | fibronectin binding protein | 23-32 | 37-46 | 50-59 | 89-98 
54_4 | 5252 | fibronectin binding protein | 43-52 | 66-75 | 95-104 | 147-156 
54_5 | 5253 | fibronectin binding protein | 49-60 | 81-90 | | 
54_6 | 5254 | fibronectin binding protein | 55-71 | 82-97 | 139-158 | 175-186 
328_1 | 5255 | lipoprotein (H. flu) | 11-20 | 61-70 | 96-105 | 

Antigenic Regions 

(cont) 

ORF | Region 5 | Region 6 | Region 7 | Region 8 | Region 9 | Region 10 
46_1 | 215-242 | 333-352 | 376-385 | 416-432 | 471-487 | 
63_4 | 145-154 | 191-202 | 212-223 | 245-265 | 274-283 | 291-300 
174_6 | | | | | | 
206_16 | 239-259 | 275-284 | | | | 
267_1 | | | | | | 
322_1 | 298-319 | 350-359 | | | | 
415_2 | 344-353 | 371-380 | 395-404 | 456-465 | 486-495 | 518-527 
214_3 | 318-337 | 365-375 | | | | 
587_3 | 106-115 | 142-151 | 156-166 | 173-182 | 186-198 | 204-213 
685_1 | 113-122 | 130-145 | | | | 
54_3 | 128-138 | 185-194 | 217-226 | 251-260 | 268-277 | 295-305 
54_4 | 175-188 | 191-200 | 203-212 | 220-229 | | 
54_5 | | | | | | 
54_6 | [illegible] | 287-304 | 317-326 | 344-353 | 364-373 | 378-387 
328_1 | | | | | | 

Table 4 

Antigenic Regions (cont) 

ORF | Region 11 | Region 12 | Region 13 | Region 14 | Region 15 | Region 17 
46_1 | | | | | | 
63_4 | 306-315 | 319-328 | 366-376 | 395-420 | 453-462 | 467-476 
174_6 | | | | | | 
206_16 | | | | | | 
267_1 | | | | | | 
322_1 | | | | | | 
415_2 | 539-555 | | | | | 
214_3 | | | | | | 
587_3 | 217-226 | 278-287 | 318-327 | 332-342 | 351-360 | 377-386 
685_1 | | | | | | 
54_3 | 316-325 | 355-372 | 387-396 | [illegible] | 438-448 | 
54_4 | | | | | | 
54_5 | | | [illegible] | | | 
54_6 | 396-407 | 427-436 | 514-531 | 541-550 | 612-622 | 
328_1 | | | | | | 

Antigenic Regions (cont) 

ORF | Region 18 | Region 19 | Region 20 | Region 21 | Region 22 | Region 23 
46_1 | | | | | | 
63_4 | 485-500 | 513-525 | | | | 
174_6 | | | | | | 
206_16 | | | | | | 
267_1 | | | | | | 
322_1 | | | | | | 
415_2 | | | | | | 
214_3 | | | | | | 
587_3 | 396-405 | 426-442 | 459-470 | 485-494 | 505-514 | [illegible] 
685_1 | | | | | | 
54_3 | 455-462 | 472-491 | 517-536 | | | 
54_4 | | | | | | 
54_5 | | | | | | 
54_6 | 639-648 | 673-681 | 703-715 | 723-732 | 749-760 | 772-788 
328_1 | | | | | | 

Table 4 

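Antigenic Regions (cont) 

ORF | Region 24 | Region 25 | Region 26 | Region 27 | Region 28 | Region 29 
46_1 | | | | | | 
63_4 | | | | | | 
174_6 | | | | | | 
206_16 | | | | | | 
267_1 | | | | | | 
322_1 | | | | | | 
415_2 | | | | | | 
214_3 | | | | | | 
587_3 | 567-578 | [illegible] | 607-840 | 844-854 | 858-870 | 877-886 
685_1 | | | | | | 
54_3 | | | | | | 
54_4 | | | | | | 
54_5 | | | | | | 
54_6 | 793-802 | 811-826 | 834-848 | 866-876 | 893-903 | 907-918 
328_1 | | | | | | 
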
Table 4 

Antigenic Regions (cont) 

ORF | Region 30 | Region 31 
46_1 | | 
63_4 | | 
174_6 | | 
206_16 | | 
267_1 | | 
322_1 | | 
415_2 | | 
214_3 | | 
587_3 | 889-911 | 927-936 
685_1 | | 
54_3 | | 
54_4 | | 
54_5 | | 
54_6 | 925-944 | 951-997 
328_1 | | 

SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: Human Genome Sciences, Inc. 

(B) STREET: 9410 Key West Avenue 

(C) CITY: Rockville 

(D) STATE: Maryland 

(E) COUNTRY: US 

(F) POSTAL CODE: 20850 

(ii) TITLE OF INVENTION: Staphylococcus aureus Polynucleotides and Sequences 

(iii) NUMBER OF SEQUENCES: 5255 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4 Mb storage 

(B) COMPUTER: HP Vectra 486/33 

(C) OPERATING SYSTEM: MSDOS version 6.2 

(D) SOFTWARE: ASCII Text 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 60/009,861 

(B) FILING DATE: 05-JAN-1996 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5895 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

TCCATTATGA AGTCACAAGT ACTATAAGCT GCGATGTTAC CAATGTTTTT TAAAATCCCA 60 

GTAATAAAAT CAAAAAATAA GTTAAATAAT GTATTCATTT TAAGTCCTCC TTAATAAAGa 120 

aaataGGTAA TAATGTAATA GCTTCTATTA TGATGCCTAA TTGAATGAAT TGGGCAAATG 180 

GCTCTTTGAT GATAAGTGTG ATAATGAAAA GGGTTAAACT AACAATAATC GCATAATATT 240 

TTTTTCGTTT AATAAGTCGC ACAGGAATGG GCTTCTTTTT AGTTGCTGCA GGAGCATATA 300 

CTCACATTAC ACCTAAAGAA ATAACTGTTA AAATAATCAT AATTAAAAAG TTAATATGAA 360 

AATTTACTAT TACTAAAGGT AAAAGTATAA ATAGTATAAT ACTTTCTACA TAACACCAAA 420 

AAGAAGAAGG TGCATGTGCa CCATGTGCAT GtCTTCTTAT TAAATAAAAT GTTAAATTCG 480 

TAATTAACGT AAACAGAAAA ATGTTTAAAA TATAGGCAAT AGTATACATA ACAATTAATT 540 

TACCTATATT TTTAGCTAAG ACCTGCATCC CTAATCGTAC TTGCAAAAAT TGAATATGAT 600 

CTAAGTTATT TCTCTTTTGA AGATACGTGG CAAACTGGTC AATTTTATTA TCAAAATAAT 660 

TCAATTTTAC ACCACTCTCC TCACTGTCAT TATACGATTT AGTACAATCT TTTATCATTA 720 

TATTGCCTAA CTGTAGGAAA TAAATACTTA ACTGTTAAAT GTAATTTGTA TTTAATATTT 780 

TAACATAAAA AAATTTACAG TTAAGAATAA AAAACGACTA GTTAAGAAAA ATTGGAAAAT 840 

AAATGCTTTT AGCATGTTTT AATATAACTA GATCACAGAG ATGTGATGGA AAATAGTTGA 900 

TGAGTTGTTT AATTTTAAGA ATTTTTATCT TAATTAAGGA AGGAGTGATT TCAATGGCAC 960 

AAGATATCAT TTCAACAATC GGTGACTTAG TAAAATGGAT TATCGACACA GTGAACAAAT 1020 

TCACTAAAAA ATAAGATGAA TAATTAATTA CTTTCATTGT AAATTTGTTA TCTTCGTATA 1080 

GTACTAAAAG TATGAGTTAT TAAGCCATCC CAACTTAATA ACCATGTAAA ATTAGCAAGT 1140 

GAGTAACATT TGCTAGTAGA GTTAGTTTCC TTGGACTCAG TGCTATGTAT TTTTCTTAAT 1200 

TATCATTACA GATAATTATT TCTAGCATGT AAGCTATCGT AAACAACATC GATTTATCAT 1260 

TATTTGATAA ATAAAATTTT TTTCATAATT AATAACATCC CCAAAAATAG ATTGAAAAAA 1320 

~*. -TAAA A^ATTCCCTT AATAATAAGT ATGGTCGTGA GCCCCTCCCA AGCTCGCGGC 1380 

TCATTTGCAA AGGGCGAAAT GGGTTCTTAC TGAGTTATCT ATTATAAAAA AATAAACATA 1560 

GACTTATGAA AAATCTCTCA TAAATCTATG TTTAGTCATG aCATGTGTTA AATATTATTT 1620 

CGGGCGCTTC TTATTTATAC AAATCTAATT TAATACTTTT AAATACAGGT ATATTTTCgC 1680 

GTTGCTGTTC TACTTCATTT AAGTTTAAAT CTACAGTCAA AATATCTGCG GATTCATTTA 1740 

ATTCTCCAAC TAAATCTCCA TTTGGGTTTA TAACTATCGA ATGACCAGCA TATTCTGTGT 1800 

TACCATCGAA TCCAGTGCTA TTAGTTCCAA TGACAAACAT ATTATTTTCA ATTGCACGTG 1860 

CCTTTAGTAA TGAATGCCAA TGTTGAAGAC GTGACATAGG CCATTGCGCC ACATAAAATG 1920 

CAATTTTAGC ACCACTACGA GCAGGATATC TTAATAATTC TGGAAAACGT AAATCATAAC 1980 

AGATAAGTTG GGTCACATAA GTACCGTCAG ACAATTGAAA GGGTTCAGCT ACGTATTCGC 2040 

CAGCGGTTAA AAATTCATGC TCTCTTAACA TAGGAACTAA ATGAACTTTG TCGTATTCaT 2100 

TAATCAGCTG GCCACTTTTA TTCACACTAA AAGCTGTATT AAATATTTGA TTGTTTCTAA 2160 

TGTTAGAAAC TGACCCAGCT ACGATATCGA CTTTATATTT TTCAGCTAAA TGTTTAATAA 2220 

ATGAAAAACT TTGTCCTAGA TTATTATCTG CTTTTTCATT TAAATGCTCT AAATCATAGC 2280 

CATTATTCCA CATTTCAGGT AAAACGACTA CATCTACTTC AGCATTCATA TTTTTTTCGA 2340 

ACCATTGCGT TATTTGAGTT TCATTTTTAG AACTATCTCC AAAAACAATC GGTAATTGAT 2400 

AAATTTGGAC TTTCATAACA TCACATCCTT GATAGATCTT ATATATAACT TACTAAAAGT 2460 

TATGTTGAAA CGCAAAAAAC GAGCACAAGA CATAAAATCA AAGTCCTAGG CTCTACAAAG 2520 

TTATATTGAC AGTAGTTGAT GGGGCCCCAA CATAGAGAAA TTGGAACACC AATTTCTACA 2580 

GACAATGCAA GTTGGGGTGG GCTCTAACAT AAAGAAATAC TTTTTCTTTA GAAATTAGTA 2640 

TTTCTTATAC ATGAGTTTTA CTCATGTATT CCTATTCTTA AGTGCACATT AGCAGCGGCT 2700 

AATGTGTAAG AACTACTACA TAATGAATAA CTAATGATTC TTTATCATTT CTGTCCCATT 2760 

CCTAACAATA TATTGATTAT TTTTTTATTA CGAAACGATC TTCCACTGGA TTAAATGTTT 2820 

TTTCGCCAGC AGCTTCACGA ATATCACCAA ATGGCATTTG AGCAATAAGT TTCCAACTTT 2880 

TAGGAATATT AAATTCATTT GAAGTCATCT CATCAACAAG TGGATTATAG TGTTGTAATG 2940 

AAGCACCTAT GCCTTTAGTA GCTAATGCAG TCCAAATTGC AAATTGATGC ATGGCATTTG 3000 

TTTGAGTTGA CCATATTGCA AAATTATCAT AGTAGTTTGG CATTTGTTCT TGTAAACCAC 3060 

TTACAACATC TTGATCTTCA TAAAACAAAA TTGTACCGTA TGAATGTTTG AAGTTATCAA 3120 

TTTTTTGTTC AGTTGGCTCG AAATCACGAT TCTCTCCCAT GACTTCTTTT AAAATTGCTT 3180 

TTGTGTTATC CCAAAATTTA TTATTGTTGT CATTTAACAA GAGAACAATT CTAGTTGATT 3240 

CATCGCTAAT TGATATCGAA TCTTTCAAAT TATATATTGA ACGTCTTTCT TCCATTGCAT 3360 

TGTCAAAAGT CATTGCTTTT TTATCTTTTT TAAATAAGCC CATAATTATT GCTCCTTCTT 3420 

TAGTAAAGAA TACTTAATAG ACTAAGTATA AAATTTATAC TCGTACTTGT AAAGCAATAT 3480 

TTACGAAAAT TTCAAGAATA TTAATATTCA TTTTCAAATT CCAAATATAA ATGCATTTTC 3540 

AACGCATATT TATTATACTT AGATTAATAC TTACATGAAA AAGGGAGGTG TCTCGTGAAA 3600 

TGTCATATCA TTGGTTTAAG AAAATGTTAC TTTCAACAAG TATTTTAATT TTAAGTAGTA 3660 

GTAGTTTAGG GCTTGCAACG CACACAGTTG AAGCAAAGGA TAACTTAAAT GGAGAAAAAC 3720 

CAACTACTAA TTTGAATCAT AATATAACTT CACCATCAGT AAATAGTGAA ATGAATAATA 3780 

ATGAGACTGG GACACCTCAC GAATCAAATC AAACGGGTAA TGAAGGAACA GGTTCGAATA 3840 

GTCGTGATGC TAATCCTGAT TCGAATAATG TGAAGCCAGA CTCAAACAAC CAAAACCCAA 3900 

GTACACATTC AAAACCAGAC CCAAATAACC AAAACTCAAG TCCGAATCCT AAACCAGATC 3960 

CAGATAACCC GAAACCAAAA CCGGATCCAA AACCAGACCC AGATAAACCA AAGCCAAATC 4020 

CGGATCCAAA ACCAGATCCA GATAACCCGA AACCAAATCC AGATCCAAAA CCAGACCCAG 4080 

ATAAACCAAA GCCAAATCCG GATCCAAAAC CAGATCCAGA TAAACCAAAG CCAAATCCGA 4140 

ATCCAAAACC AGACCCTAAT AAGCCAAATC CTAACCCGTC ACCAGATCCC GATCAACCTG 4200 

GGGATTCCAA TCATTCTGGT GGCTCGAAAA ATGGGGGGAC ATGGAACCCA AATGCTTCAG 4260 

ATGGATCTAA TCAAGGTCAA TGGCAACCAA ATGGGAATCA AGGAAACTCA CAAAATCCTA 4320 

CTGGTAATGA TTTTGTATCC CAACGATTTT TAGCCTTGGC AAATGGGGCT TACAAGTATA 4380 

ATCCGTATAT TTTAAATCAA ATTAATAAGT TGGGCAAAGA TTATGGAGAA GTTACTGATG 4440 

AAGACATTTA TAATATTATT CGAAAACAAa ATTTCAGCGG AAATGCATAT TTAAATGGAT 4500 

TACAACAGCA ATCGAATTAC TTTAGATTCC aATATTTCAA TCCATTGAAA TCAGAAAGGT 4560 

ACTATCGTAA TTTAGATGAA CAAGTACTCG CATTAATTAC TGGTGAAATT GGATCAATGC 4620 

CAGATTTGAA AAAGCCCGAA GATAAGCCGG ATTCAAAACA ACGCTCATTT GAACCGCATG 4680 

AAAAAGACGA TTTTACAGTA GTTAAAAAAC AAGAAGATAA TAAGAAAAGT GCGTCAACTG 4740 

CATATAGTAA AAGTTGGCTA GCAATTGTAT GTTCTATGAT GGTGGTATTT TCAATCATGC 4800 

TATTCTTATT TGTAAAGCGA AATAAAAAGA AAAATAAAAA CGAATCACAG CGACGATAAT 4860 

CCGTGTGTGA TTCGTTTTTT TTATTATGGA ATAAAAATGT GATATATAAA ATTCGCTTGT 4920 

— "-^TTrAAAGr CTCAGGATTA AGTAATTGGA ATATAACGAC AAATCCGTTT 4980 

AATATTAATG AACTTACTGT TGTAGCAATA ATAAATGCCA CGATACGATT ACCTTTAATC 5160 

GCATTAAATA ATTCTCCAAA GATTACTTTT CTGAATACAT ATTCTTCTAA TAAAGGACCA 52 2 0 

ATAATAGATA CAAAGAAGAT AAATATAGGT ATTTTTCGAG CAATAATAAT TAGCTTTTCT 52 8 0 

GTATTAGGAC TTACTTGTTG TCCACCATAA ATTTGCGTTA ATACAATGCT CACTACCATT 534 0 

TGATAAATCA TTACCAATGC AAATCCAAGC AATGCCCATG GAATGATATA TTTTTTAGGT 54 00 

TCTTTAACTT CTAATTCTAA TTTTGTTGGA TTTTTAATTT TTAAATTAAT TAAAATAATC 54 60 

GTCGTGGCGG CGATTAAAAA TAGAACAAGT TGTATGTAAA TGACTGCTTT AGTCAGTTCT 5520 

ATGCCACTAT ATTGTACAAA TGGTAATTTT TTTACAATGA GAAGCGGTAA AAATTGAGAC 5580

AATATATAAA TAATAACAGT TAGCAATGAT GCCCATAATC tTGTCATAAT TTTCCTCCAA 564 0 

ATATTTGTTT ATAATTTATT TTATCGTAAA TAACTTGAAG TTACAAAACT TAATTAAAAG 5700 

GTTATGACTT GAAATTTTGA CCAAATTTGA TTATTATAAA TGTATGTTAG CACTCTTTAA 5760 

TGTTAAGTGC TAAACTTTAG GTTTTTTAAG GAGGAACAAT CATGCTAAAA CCAATTGGAA 5820 

ATCGTGTGAT TATTGAGAAA AAAGAACAAG AACAAACAAC TAAAAGTGGn ATTGTTTAAC 5880

TGATAGTGCT AAAGA 5895
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6796 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

TTTGAAAAAA CAAGGTACGA TTGGTTTAAT AACATATATG AGAACCGATT CTACACGTAT 6 0 

TTCaGATACT GCCAAAGTTG AAGCAAAACA GTATATAACT GATAAATACG GTGAATCTTA 12 0 

CACTTCTAAA CGTAAAGCAT CAGGGAAACA AGGTGACCaA GATGCCCATG AGGCTATTAG 18 0 

ACCTTCAAGT ACTATGCGTA CGCCAGATGA TATGAAGTCA TTTTTGACGA AAGACCAATA 24 0 

CCGATTATAC AAATTAATTT GGGAACGATT TGTTGCTAGT CAAATGGCTC CAGCAATACT 3 00 

TGATACAGTC TCATTAGACA TAACACAAGG TGACATTAAA TTTAGAGCGA ATGGTCAAAC 360 

AATCAAGTTT AAAGGATTTA TGACACTTTA TGTAGAAACT AAAGATGATA GTGATAGCGA 420

AAAGGAAAAT AAACTGCCTA AATTAGAGCA AGGTGATAAA GTCACAGCAA CTCAAATTGA 4 80 

ACCAGCTCAA CACTATACAC AACCACCTCC AAGATATACT GAGGCGAGAT TAGTAAAAAC 54 0 
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AAAGCGTAAC TATGTCAAAT TAGAAAGTAA GCGTTTTGTT CCTACTGAGT TGGGAGAAAT 660 

AGTTCATGAA CAAGTGAAAG AATACTTCCC AGAGATTATT GATGTGGAAT TCACAGTGAA 720

TATGGAAACG TTACTTGATA AGATTGCAGA AGGCGACATT ACATGGAGGA AAGTAATCGA 780

CGGTTTCTTT AGTAGCTTTA AACAAGATGT TGAACGTGCT GAAGAAGAGA TGGAAAAGAT 84 0 

TGAAATCAAA GATGAGCCAG CCGGTGAAGA CTGTGAAATT TGTGGTTCTC CTATGGTTAT 900

AAAAATGGGA CGCTATGGTA AGTTCATGGC TTGCTCAAAC TTCCCGGATT GTCGTAATAC 960 

AAAAGCGATA GTTAAGTCTA TTGGTGTTAA ATGTCCAAAA TGTAATGaTG GTGACGTCGT 1020 

AGAAAGAAAA TCTAAAAAGA ATCGTGTCTT TTATGGATGT TCGAAATATC CTGAATGCGA 1080

CTTTATCTCT TGGGATAAGC CGATTGGAAG AGATTGTCCA AAATGTAACC AATATCTTGT 114 0 

TGAAAATAAA AAAGGCAAGA CAACACAAGT AATATGTTCA AATTGCGATT ATAAAGAGGC 1200 

AGCGCAGAAA TAATATTTTT ATTTCCTACA TACATTTTAA GATTGTTAAA TAGAATCATT 1260

AGTGAATCTT ATTTTAAAGA TAGTAAAGGA TTAATCTAAA TAAGTGCGGA TAATATAAAC 13 20 

ATAACAACAT AATTAAmAGA CATAAATGAC aATAAAAGGA GTATAGAAAT GACTCAAACT 13 80 

GTAAATGTAA TAGGTGCTGG TCTTGCCGGT TCAGAAGCGG CATATCAATT AGCTGAAAGA 144 0 

GGAATTAAAG TTAATCTAAT AGAGATGAGA CCTGTTAAAC AAACACCAGC GCACCATACT 1500 

GATAAATTTG CGGAACTTGT ATGTTCCAAT TCATTACGCG GAAATGCTTT AACTAATGGT 1560

GTGGGTGTTT TAAAAGAAGA AATGAGAAGA TTGAATTCTA TAATTATTGA AGCGGCTGAT 1620

AAGGCACGAG TTCCAGCTGG TGGTGCATTA GCAGTTGATA GACACGATTT TTCAGGTTAT 1680

ATTACTGAAA CACTTAAAAA TCATGAAAAT ATCACAGTTA TTAATGAAGA AATTAATGCC 1740

ATTCCAGATG GATACACAAT TATCGCAACA GGACCACTTA CTACAGAAAC CCTTGCGCAA 1800 

GAAATAGTGG ACATTACTGG TAAAGATCAA CTTTATTTCT ATGATGCGGC TGCTCCAATT 1860 

ATTGAAAAAG AATCTATTGA TATGGATAAA GTTTACTTAA AGTCCCGTTA TGATAAAGGT 192 0 

GAAGCTGCAT ATTTAAACTG TCCTATGACT GAGGATGAAT TTAATCGCTT TTATGATGCA 1980 

GTATTAGAAG CTGAAGTTGC GCCTGTAAAT TCATTTGAAA AAGAAAAATA TTTCGAGGGT 2040

TGTATGCCTT TTGAAGTAAT GGCAGAACGC GGACGCAAGA CATTACTATT TGGACCAATG 2100 

AAACCAGTAG GATTAGAAGA TCCAAAGACT GGGAAACGTC CTTATGCGGT GGTTCAATTA 2160 

AGACAAGATG ACGCTGCTGG TACACTCTAC AATATTGTTG GCTTCCAAAC GCATTTAAAA 2220
.,..^, A ,~- -r^TAAA^A ATTrCAGGCT ~AGAAAATGT TGATATTGTT 2 2 80 
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T ATG TAG AAA GCGCAgcTAG CGGCTTAGTT GCAGGTATCA ATCTTGCGCA TAAAATATTA 24 6 0 

GGCAAGGGTG AGGTAGTATT TCCGAGAGAA ACAATGATTG GAAGTATGGC TTACTATATT 2 52 0 

TCTCATGCTA AAAACAATAA GAATTTCCAA CCTATGAATG CTAACTTCGG GTTATTACCA 2 580 

TCTTTAGAAA CTAGAATTAA AGATAAAAAA GAACGCTATG AAGCACAAGC TAATAGAGCT 2 64 0 

TTGGATTACT TAGAAAATTT CAAAAAAACT TTATAAAATA GTTAGAAAGA CTAGATATGC 2700

TATTCATTCT TAAGTCATCA ACGAGTAAGT AATGACTTTC TAAATGGAAA ATACTTATCC 2 76 0 

TAGTCTTTTT AATTTTGGAA TTGTTACGTA TTTCTGACAA TTTAGAATTC GCATTCAAAA 2 82 0 

AATATCTAAA TAAATAACAC GCAATAAGTT GATTGATGTA ACATGTAAGA GAATGTTTTA 2880

AATAAACTTT ATTTAAAAGG CAATGAAATA ATAAATGGCA AGGCTATTAA TAAAGACTTT 2 94 0 

TAGTAATTAA TTTAAAAAAG AGGTATTCTA ATTAACAGGT TTTCCGATTA GTTACAATTA 3000 

TTTAATTCTC AAAAGATTTA GAATTGATTA TCAAATTACT GTAAGCCCTT TGCTGTATAT 3 060 

GCTACAATTC TTATTGATGG AGGGTAAATG TATTGAATCA TATTCAAGAT GCGTTTTTAA 3120 

ATACATTGAA AGTTGAACGG AATTTTTCGG AACACACATT GAAATCATAT CAAGATGACT 3180 

TAATTCAGTT TAATCAATTT TTAGAACAAG AACATTTAGA GTTGAATACT TTTGAATACA 324 0 

GAGATGCTAG AAATTATTTG AGCTATTTAT ATTCAAATCA TTTGAAAAGA ACATCTGTTT 33 0 0 

CTCGTAAAAT CTCAACGTTA AGAACTTTCT ATGAATATTG GATGACGCTT GATGAGAACA 3360

TTATTAATCC ATTTGTTCAA TTAGTACATC CGAAAAAAGA AAAATATCTT CCGCAATTCT 342 0 

TTTACGAAGA AGAAATGGAA GCGTTATTCA AAACTGTAGA AGAGGACACT TCAAAAAATT 34 8 0 

TACGGGATCG AGTTATTCTT GAATTGTTGT ATGCTACAGG CATCCGTGTT TCGGAATTAG 3 54 0 

TAAATATTAA AAAACAAGAT ATAGATTTTT ACGCGAATGG TGTTACCGTA TTAGGAAAAG 3600 

GGAGCAAAGA GCGCTTTGTA CCGTTTGGTG CTTATTGTAG ACAAAGCATC GAAAATTATT 36 6 0 

TAGAACATTT CAAACCAATT CAGTCATGCA ATCATGATTT TCTTATTGTA AATATGAAGG 3720 

GTGAAGCAAT CACTGAACGC GGTGTACGAT ATGTTTTAAA TGATATTGTT AAACGAACAG 3 780 

CAGGCGTAAG TGaGATTCAT CCCCACAAGC TCAGACATAC ATTTGCAACG CATTTATTGA 3840

ATCAAGGTGC AGACCTAAGA ACAGTACAAT CGTTATTAGG TCATGTTAAT TTGTCAACAA 3 90 0 

CTGGTAAATA TACACACGTA TCTAACCAAC AATTAAGAAA AGTGTATCTA AATGCACATC 3 96 0 

CTCGAGCGAA AAAGGAGAAT GAAACATGAG TAATACAACA TTACATGCAA CAACAATTTA 4 02 0 

TGCTGTAAGA CATAATGGGA AAGCAGCTAT GGCTGGAGAT GGGCAAGTAA CGCTTGGTCA 4 08 0 

ACAAGTCATC ATGAAACAAA CGGCAAGAAA AGTGCGACGT TTATATGAAG GTAAAGTGTT 414 0 
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ATTACAACAG TTTAGTGGTA ACTTAGAAAG AGCTGCTGTT GAATTGGCAC AAGAATGGCG 4260

AGGCGATAAA CAATTACGTC AATTAGAAGC TATGCTAATT GTAATGGATA AAGATGCTAT 4320

7TTAGTTGTC AGTGGAACTG GCGAAGTTAT TGCTCCAGAT GATGACCTTA TCGCTATTGG 4380

ATCAGGAGGC AACTACGCAT TAAGCGCAGG ACGTGCATTG AAACGCCATG CATCGCATTT 4440

GTCTGCTGAA GAAATGGCAT ATGAGAGCTT GAAAGTAGCG GCTGATATTT GTGTCTTTAC 4500

CAACGATAAT ATTGTTGTCG AAACACTATA ATAATCAGAG CACGATAAAT AATTACGAGC 4560

AATTAATTTT AGTTAAAAGA CGGAGGAATG AAATTAATGG ATACAGCTGG AATAAGATTA 4620

ACTCCAAAAG AAATCGTATC TAAATTAAAT GAATACATCG TTGGACAAAA TGATGCTAAA 4680

CGTAAAGTGG CAATTGCCCT ACGTAATCGA TACAGAAGAA GTTTATTAGA TGAGGAATCA 4740

AAGCAAGAAA TTTCACCTAA AAATATTTTG ATGATTGGAC CAACCGGCGT TGGTAAAACT 4800

GAAATTGCAA GAAGAATGGC CAAAGTTGTC CGCGCGCCAT TTATAAAAGT AGAAGCTACT 4860

AAATTTACTG AGGTAGGTTA TGTAGGACGA GATGTTGAAA GTATGGTTAG AGATCTTGTT 4920

GATGTTTCAG TAAGATTAGT CAAGGCGCAG AAAAAATCAT TGGTACAAGA TGAAGCAACA 4980

GCTAAGGCCA ATGAAAAACT TGTTAAGTTA TTAGTTCCAA GTATGAAAAA GAAAGCGTCT 5040

CAAACGAATA ATCCTTTAGA GTCACTTTTC GGAGGTGCAA TTCCAAATTT CGGACAAAAT 5100

AACGAAGATG AAGAAGAACC ACCTACTGAG GAAATTAAAA CAAAACGTTC TGAAATTAAG 5160

AGACAGCTAG AAGAAGGCAA ACTTGAAAAA GAAAAGGTAA GAATTAAAGT CGAACAAGAT 5220

CCTGGTGCTT TAGGTATGCT AGGTACAAAT CAAAATCAGC AAATGCAAGA GATGATGAAT 5280

CAATTAATGC CTAAAAAGAA AGTTGAGCGA GAAGTTGCTG TTGAGACGGC AAGGAAAATC 5340

TTAGCTGATA GTTATGCGGA TGAACTAATT GATCAAGAAA GCGCTAACCA AGAAGCGCTT 5400

GAATTAGCAG AACAAATGGG TATCATCTTT ATAGATGAAA TCGACAAAGT TGCGACGAAT 5460

AATCATAATA GTGGTCAAGA TGTCTCAAGA CAAGGTGTTC AAAGAGATAT TTTACCTATA 5520

CTTGAAGGTA GCGTTATTCA AACCAAATAT GGTACTGTGA ATACTGAACA TATGCTGTTT 5580

ATAGGTGCTG GAGCTTTCCA TGTATCTAAG CCGAGTGACT TGATACCAGA ATTGCAAGGT 5640

CGTTTTCCGA TTAGAGTTGA ACTTGATAGT TTATCGGTAG AAGATTTTGT AAGAATTTTG 5700

ACAGAACCAA AATTGTCATT AATTAAACAA TATGAAGCAT TGCTTCAAAC AGAAGAAGTT 5760

ACTGTAAACT TTACCGATGA AGCAATTACT CGCTTAGCTG AGATTGCTTA TCAAGTAAAT 5820

^"T^ATACAA TTTTAGAAAA GATGCTAGAA SRRC 
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AAATATACAA AAGGAGAAAA ATTCATGAGC TTATTATCTA AAACGAGAGA GTTAAACACG 6 06 0 

TTACTTCAAA AACACAAAGG TATTGCGGTT GATTTTAAAG ATGTAGCACA AACGATTAGT 612 0 

AGCGTAACTG TAACAAATGT ATTTATTGTA TCGCGTCGAG GTAAAATTTT AGGATCGAGT 618 0 

CTAAATGAAT TATTAAAAAG TCAAAGAATT ATTCAAATGT TGGAAGAAAG ACATATTCCA 6240

AGTGAATATA CAGAACGATT AATGGAAGTT AAACAAACAG AATCAAATAT TGATATCGAC 63 0 0 

AATGTATTAA CAGTATTCCC ACCTGAAAAC AGAGAATTAT TCATAGATAG TCGTACAACT 6 36 0 

ATCTTCCCAA TTTTAGGTGG AGGGGAAAGA TTAGGTACAT TAGTACTTGG TCnAGTACAT 6420 

GATGATTTTA ATGaAAATGA TTTGGTACTA GGTGAATATG CTGCTACAGT TATTGGTATG 6480

GAAaTCTTAC GTGAGAAGCA TAGTGAAGTA GAAAnAGAAG CGCGCGATAA AGCTGCTATT 6 54 0 

ACAATGGCAA TTAATTCATT ATCTTATTCT GAAAAAGAAG CGATTGAACA TATCTTTGAA 6 600 

GAACTTGGCG GTACGGAAGG CCTATTAATC GCATCAAAAG TTGCAGATAG AGTTGGTATT 666 0 

ACTAGATCTG TAATTGTAAA TGCACTACGT AAATTAGAAA GTGCTGGTGT AATTGAATCA 6720 

CGTTCTTTAG GAATGAAAGG TACTTTCATT AAAGTTAAAA AAGAAAAATT CTTAGATGAA 6780 

TTAGAAAAAA GTAAAT 6796
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2073 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATCCTAAAAT TnAAAATTAT CACGCCTTTT GaACAGCTTT GTAACCaTCt GGACGATCAT 60

kAAATTCCaA TGTAAATCCT GGTTTAAaGT TGATCTTTAA CCTTATTTAA AyCACCAATT 120 

GTACGTATAT TATGTTGTTT AGCAAAATCA CGTTTTACAG CTAAAGCATA CGTATTGTTA 180 

TACTTCATTG GTTTTAACAT AGTCATTTGA TATTTCTTTT CAAGACTTTG CTTAGCTTGT 240

TCATAAACTT TTTTCTCTTC TTTTGACTTC AATGGTTCTT TTGTTAATTC ACCTAAAACT 3 00 

GTTCCAGTAA ATTCTAAATA CCCATCTATA TCGTCAGATT TTAAAGCATT AAATAAAAAT 360 

GCTGTTTTGC CCATACCATC TTTCACTTCT ACAGTATTTT TGGTCTCTTC TTCTATTAAA 420

ATTTTATACA TATTTGTAAT AATCGATGGC TCGGAGCCAA GCTTTCCAGC TAACGTAATT 4 80 

TTATCACCTT TTTGTGCAAA CATAGGAATA GCGATAGCCA GTATAATAAT CATCACTATA 54 0 
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TCAAATATAA TTGCCAATAA GGCTGCTGGA ATTGCACCTA ATAATAT CAA CGATGCATTG 6 60 

TTACGGTCTA TACCTAATAA AATTAAATCT CCTAGTCCGC CTGCACCAAT TAATGCTGCT 720

AGTGTTGCTG TACCTATAAT TAATACCATA GCCGTTCTTA CACCAGCCAT TATAACAGGC 7 80 

ATTGCTATCG GAAGTTCGAC TTTAGTTAAA CGTCTAAATG GTTTCATACC TATACCTTTA 840

GCCGCTTCAA TGAGTGATGG ATCAACTTCT TTAATTCCAG TATACGTATT CCTTAAAATT 900 

GGTAACAACG CATACACTAC AAGTGCAATA ATTGCTGGCA CACGACCGAT ACCAAATAAA 9 60 

GGAATCATTA AACCTAATAA TGCCAACGAT GGTATGGTTT GAAGAATTGC CGCAATATTC 1020

ATTACGATTT CAGATATCGT TTTAGTCTTC GTTAATAAAA TACCTAATGG TACCGCAATA 10 80 

GCAGTTGCAA TCAATAATGC GATAAATGAT ATTTGAATAT GTTCTATCAT TGTCGAAAAG 114 0 

AGTTGCCCCT TACGTTCACT CAATATGTCg AAAAAGTTAG TCATGTTGAG CTACCTCCTT 12 00 

TTTCTGGGAC AAATATTTGA AGATATCTTT CCTATCAATA ACATATTGAC CTACGCTATC 1260

TTCTTGCATG ACAATGACAC GCTCGCTCTC TGATAAAAGT TGATACAATA CTTCAATTGG 1320 

TTGATTGTCA TAAACAATTG GATAAGCGCT CATAGATGTA ACCTCATCGA TTGGTTTCAT 13 8 0 

AATATCCAAG TCACGGATAA TTGCGTTCTC TTCAACACAT GGCGCATCAT CTTCTAAATG 144 0 

ACTACCCATA AATTGTTTAA CAAATTCACT TTGAGGATTA TTTTTAAATC CTTCTGGTGT 1500 

GTCAATTTGT TCAATATGCC CTTCATTCAA AAGACAAATC TTATCACCAA GTTTCATCGC 1560 

CTCTTGAATA TCATGTGTAA CAAATATGAT TGTCTTCTTA ATTTTAGTTT GTAATTCAAT 1620 

TAAATCATCT TGAAGTTTTT CTCGGCTGAT TGGGTCTAAT GCACTAAACG GTTCATCCAT 16 80 

TAAAATAACT GGTGGATCAG CTGCTAACGC ACGTATAACT CCTACACGTT GTCGTTGCCC 174 0 

CCCTGACAAT TCATCAGGTT TTCTGTTTTT ATATTTTTCA GGTTCTAATC CAACCATTTC 1800

AAGTAATTCA TCTACTCTTT TATCTATATC TTTTTCTTTC CACTTTTTCA TTTGTGGCAC 1860

TTGTGCAAtA TTTTCTTTGa wTGTCaTATG TGGGAATAAT GCAATCTGCT GcAATACGTA 1920 

TCCAATATCC CAACkCATTT CGTATACTGG ATAATCACTT ATTGGTTTAT CTTTAAAATA 19 80 

AATATAACCT TCACTTAAGT GAATGAGTCG ATTAATCATT TTTAATGTCG TAGTTTTTCC 2 04 0 

ACAACCTGAA GGTCCAATTA GCACAAAAAA TTC 2073
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13321 base pairs 

(B) TYPE: nucleic acid 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

ACTATTCTAG CTTCATCAGT TATCATATAT TCTTTGAAAC ACTTGTAAGA AAATATAATG 6 0 

AGTATTTACT ACATAATGAT ATTTCAAATT AGAAAAAAGG AAGTTATGAT TTAATGGCCT 120

TGAGCCTATC ATAACTTCCT TTTATCATTT TATTGTTGTG TTGATGTTTC GATAACGTGG 180 

TACATCTTAT CAAACATCAA TTCGAAACCA TGCACCATGG CATCATGATA TTCTTTTTTC 24 0 

TTTTGCTTGT ATTCTAAATT AGTAAATCGT CTTTCTTTTT CAACTAATGA ACGATAATAA 3 00 

AATAGCATTT GGGTGCCACC TGTTTCACGT TCAAAAAATT CTACCTCAAT GACATCTTGC 360 

GTTTCACTTA GTCCAGGCAT ACCGATAGTC ATCTTAACGT ATTCATCCAT AACTAAAGAT 420

TCATAAATGC CTTCAATCAC ATTTACTTTG CCATTACGTT GTTGATCTAC AATACGATAT 480 

TTACCGCCTT CTTTAACGTC CGCTTCAATC TCTTTATTCG TTCTGGCTGA TGTCATAAAC 54 0 

CATTGTTTCA ACAAATCTTT CTTTGTCCAA GCTTCGTATA CTAACTCTGG AGAAAATTTA 600

TAAAGCTTTT CAATTTCAAC TTCGACATGT TCATTCTCTA CATTAAATTT TGCCACTGTT 6 60 

GTCCACCCAC TTTCGCTCTT ACTTTTATTT TAACGTATTT TTGCTCAGTT CCAAACATAG 720 

ATGATCATCA TTTTTAAAAG ATTAGCGTTA TACGGTGAGT ACAACATGAT CTGTTAATAT 7 80 

AACAAGCCAC CTTACTTGGC TACATCGATA TATTGTTAAG CATTAATGTT TCATTTCTTG 84 0 

ACTAGTGTTC TTTTTTAGCT TTGGAAAATT AAATAAAATC GCAATAAGTC CGCATACACC 900 

TAATAATATA GGATAAATGC TGTATGGGAA TAACATTAAC GGTGAAATAC CAGCTACACC 96 0 

AGCCGCTGaA ATGACTTGCG GGCTATATGG TAATAAACCT TGGAAGCAGC CTCCAAATAT 102 0 

ATCAAGAATA CTTGCTGATT TCCTTGAATC TACATCATAT TCATCTGCAA TATTTTTAGC 1080

TAAAGGACCT GACATAATAA TAGAGATGGT GTTGTTTGCC GTGGCAATAT CTGCGACACT 114 0 

TACCAAACTA GCAATTCCTA ATTCTGCGCC ACGCTTTGAT TTCACTTTAG AGCGAACAAA 1200 

TTGCAACAAC CATTCAATAC CACCATTGTG TTGAATAATA CCGACTAAAC CACCAATTAG 1260 

CAACGCAATC ATAGCAATAT CTTCCATGCT TATAATACCT TTGGACACTG CATCTAGTAG 1320 

CCCCATCCAA CCGAATGAAC CATCTATGAG ACCAATGATT CCGGCTAATA ATGTTCCGCC 13 80 

AATCAATACG ATAATGACAT TTACACCTAA TAATGCTAAT ACCAATACTA AGATATACGG 144 0 

TACAACTTTA ATTAGATTAT AATCATAGTt TTTAGCATGA TTTAAAGAAA TGCCATTCGT 150 0 

TAAGAAATAC AGAATAATAA TCGTTAAAAT AGCACCTGGC AATACAATTT TAAAGTTTAC 1560

TCTGAATTTA TCTTTCATTT TCGTATGTTG TGTTCTAACC GCAGCAATTG TTGTATCTGA 162 0 

AATCATTGAT AGATTATCGC CGAACATTGC ACCTCCAACA ACTGTAGCCa tTGctAGCGC 1680
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TCCTACAGAC GTCCCCATAG ATATAGAAAC AAACATACAA ATCACAAACA ATCCTACAAT 1800 

AATTAAATTT TCTGGGATTA ATGATAGTCC TAAATTAACT GTCGACTTTA CGCCACCCAT 1860 

TTTTTCAGCT GTATTTGAAA ATGCACCTGC TAAAATAAAA ATCAACATCA TTAAAACAAT 192 0 

GTTTGAATGG CCTGCACCTT TCGTGAAGAC CTCAACTTTT TTAGCAAATG ATTCTTTTCG 198 0 

ATTCATTAAT AACGCCACAA TTACCGTTAT CGTAATTGCA ACATTTAATG GCATTGAAGT 2 04 0 

AAAATCACCT GTGATAATAC CTACGCCTAA AAACAACGCC ACAAATAATA ACAAGGGGAA 2100 

TAATGCCCAA GCATTGCTCT TTTTATGTAC TTCCATCCTT TTTACCTGCT TTCCAATTAA 216 0 

AAATACCTCT TTCTCACAAA CGATGAAGAA AGAGGTTTTC ATGTGCTTTA CCTGCTTATC 2220

TTCAAACCAT TACGGTTACT GGAATTGGCA CATTCGAGAT GTTGCCGAGG CTTCATAGGG 22 8 0 

CCAGTCCCTC CACCTCTCTA GATAAGTGAT GCTTATTTAC GTTTACGTTA CAAGATAATC 234 0 

CTTAGTACGT CAATCATAAA TTAATCAGGA GTCGTATAAT ATTTTTCATA AACAATCATT 2400

GCTACTGTAA TAATAATCAA AACAATAATG CTAATAACAA GTAAAAGCCA CCATTTAAGC 24 6 0 

ATTAATGCAA TAAAAATGAA CACGATAGAC ACACTTACTA ATATTAATGA TATGACTTTA 252 0 

AATTGCTGAA CACGTTGCTT GGAGATGACT TTCAACTGTT TGTTTGATAG ACGCGTATTT 258 0 

TTTATACTGA TTCCCAGTAT ATTTTCTAAT ATTTGAACCA ATACGATACT TATTGCAAAT 2 64 0 

ATAATAATTG GTAAAACATC ATAGCTCCCT ATAGTTAATG TATAAATTAC AAATCCAATG 2700

TAAAGTAACC CTGAGACAAA GGATAAAAAG TATGCGACGT ATTTGTTAAA CTTAATGATA 2 76 0 

TGCTTTTTAA CGTTTTGATG TGTAAACCAT ACATTCGAAA CGATCGCAAC TGCTACAAAT 2 82 0 

AATGTGAATA CTATATATAA TGGTAATTTT TGTTCAGGAA AAACAGTCGC TATTCCAAAA 2880

GCTAATGCTA AAATCAAAAA TAATATAGCT CTAGATACTA TTAATGCCAT AATAACAACC 2 94 0 

CCTTTGTTTA ATATCGAGTT TGCAAATTTA CGTTTATCAG CGTTTCTATG ATCAGTACTT 30 0 0 

CTACGGGTAG CGTTTCTATG TAATTTACAT CATCTTAACA TATAAATACT TCGCTATTTA 3 06 0 

ATTGAAAACA TATCCTATTA TTCTTTGTCC GTTCTGACGT TTAATATCTA GCCTTAGGCA 3120 

TTTCACTTGT TAATGAATTT AACTTTCTTC CACTAACCGT CCCTAAACCC AATCCCGCAA 3180

CAGTTTTTAA CTTTTTCGTT GTTGTCCTGA CATCCTCATT AAGAAAGTTT ATTCTGCTTA 324 0 

AAACTTATAA TCCACACCCT GAGCAAACGC TCCTTATGAC AGAGTATTAA AATAAGCCGA 3300 

TAAAGATACA CACCTTTACC GACTATTTAA AATACACTTC ACCAATTCAT TTTAATTTAA 3360

- • — -r. - - ~:a^AAA , T AATATTATGT TGTTCAATTA AAAGCTTCAT ACAAACCTAA 34 2 0 
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GTTTTTTGAC CAAATGTTGG GATTTTACTT TGAGGTTGTC CACCAGAAAT TTGTAATGGT 3 600 

GACCAGAATG GACCAGGCGC TACACAGTTC ACTCTAATTC CTTTTGGTCC TAATTCTTCT 3 660 

GAAAAACTTT TAGTTAATGA AATAATTGCT GCTTTTGAAG CGGCATAATC ATGAAGAATA 3 720 

GGACTAGGAT TATAACCTTG TACAGATGAT GTCGTTGTAA TTGACGCACC CGGTTTTAAA 3 780 

TATTCCAATG CTTTTTGAAC TGTCCAAAAT AGCGGATAGA CATTCGTTTC AAATGTTTCT 3 84 0 

GTAAATGCCT CAGTTGTAAA TCCATGAATA TCATCATGAT ACTGTTGATG TCCAGCAACT 3 900 

AAAGTAACAT TATCTAAGCC ACCTAATTGT TGATATGCTT GTTCAACAAG GTCATAGTTG 3 96 0 

AACTGTTCAT CTCTTATATC ACCAGGAATT AACACTGCCT TTTGACCACT TTCTTCAATC 4 02 0 

ACTTGGCGTA CTTCTTGTGC ATCTTGTTCT TCACTCGGAA GATAGTTAAT CGCTACATCT 4080 

GCACCTTCTT TAGCATACGC AATTGCTGCT GCACGCCCTA TTGCTGAGTC ACCACCTGTG 414 0 

ACTAATATTT TATAGCCTTG TAAGCGTTGA TGACCTTGGT AAGACGTTTC GCCACAATCG 42 0 0 

GGTGCTGGCG TCATTTCAGA TTGTAAACCC GGTACCTCTT GTTCTTGTTT TTCATAATCC 426 0 

GTTGTTTTAA ATTTTGTTCT AGGATCTTGA GCTGCCATTT TTTTACATCT CCTTATTCGC 43 20 

TTAATGGTTA TTATTTACCC AATCTTCCTA GGAACTTAAT CATGATTACA CTAAAAATTA 43 8 0 

CTTTCTTCTT TATAAAAACA AGCTCGAATT ATTCATGCAA TAGTCTCTTT ACAAATTCAA 4 44 0 

CAAAATACTC AGGTACTTTT TCCAGAATCC TTTCATCCGG TTTATATTGA GGATGATGTA 4500 

AATCATATTC ACTATGAGAA CCAATTAACG CAAATACACT TGGAAAATGT TGACTATAAC 4 560 

CTGAAAAATC TTCTCCAATC GTAAGCGGCT GTTCCATCAT TCCCACCTTA TATCCAACAT 4 62 0 

GTTGGGCTAC TGCAATTGCT TTATGCGTCA ATGCCTCATC ATTCATCACA GCGCCAGGTA 4680

AATGCGTATA ATTTAAATTA ATTTTCATAT TATATGCTTG AGCCAATCCG TCCGCAATAT 4 74 0 

CTTGTAATCG TGTTTCTACA AGCTTTCGTA CCACAGGATC AAAACTACGC ACTGTGCCTT 4 800 

GTACATACGC ATGATCAGCA ATGACATTCC AAGTATTACC ACATGATATT TGTCCAATTG 4 860 

TTACTACCGC TTCATCAAAC GCAGATAGAT TTCTACTAAC TATGGATTGA ATACTATTAA 4 92 0 

TCAATTGCGC CAACACAATA ACTGGATCGT TGCATTGTTC TGGcTTTGCA GCATGACCAC 4 980 

CCACGCCTTT AATATGAAAC TCAAAACGAT CTACTGCTGA TGTAATTGCC CCTGTTTTGA 5 04 0 

TTGCAAATGT ACCTACCGAA CGCGATGGGT CATTATGAAA ACCCAATACT GCTTGTACAT 5100 

CTTTTAATGC ATGTGTTTCA ATAATTTTAA AAGCGCCATG TCCTAGTTCT TCTGCTGATT 5160

GAAAAATGAA TTTAACACGC CCAGTAAGAG TGCCCTCAAT TTCTTTTAAT TTTACAGCTG 522 0 

TAGCCAAAAT ACTAGCCATG TGAATATCAT GACCACACGC ATGCATAACA CCTTCATTTT 5 2 80 
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CAGCTATACA ACTCAGACCT TGTCCCACTT CAGCAACAAG CCCAGTCGCA AGTGGTAAGT 54 00 

CTAATATTCT AATATGATGT TCTGTTAAAA TATCTTTAAT TTTTTGTGTA GTCTTAAATT 54 60 

CTTTATCGGA TAGTTCTGGA AATTGATGAA AATACCTTCT CCAGGTAACA GCTTGATCTT 552 0 

TTAATCCCAT CGGTCATTCC CCTTCCTTAA GTCAATGATA TGTTGTCTAC CCTACGATGA 5580

TCATCTTTGA CTATTAAACG ATGATTTCAC AACAATGTAC TCTTGTTAAT TGCTTTCGTT 564 0 

AATGATAGAC AGTTGTTTAA TAATATCGTA ACACTGTTGT CAAACTATTC TAACTTTTAT 57 00 

AATTGAGACT CTATACAAAA ACGTGTTCTC GAATATACTT GTTTTTACAA ACCACAAAAA 576 0 

GCTCTAAACA TTAGTTTAAA CCAATGCTTA GAGCTTTCTA ATTATTTTAT GCTTTAAAAG 5820

ATACTGTGTT ATCTACGATG ACCTTACCGT CTTTAATAAC TTTTTCTGCG TGATTGATAC 58 80 

CAAAATGATA TGGAATATAT TCATGATTTG GTGCATCCCA AATTACTAAA TTAGCCTTAT 594 0 

CACCTGTGTT AATTGTACCC GCGTTAATGT CTATTGCTTT AGCAGCATTG ACCGTAACAG 60 0 0 

CATTCCAAAC TTCATTAGGT GATAGCTTTA ATTTCAAGGC TGCAATCGCC ATAACAAGTT 6 06 0 

GTAAGTTGTT TGTGACACTA CTACCAGGGT TATAATCAGT TGCTAATGCA ATCGCACCGT 6120 

TATTGTCAAG CATGCCTCTT GCATCTGCAT AATCTTCTTT ACCTAAATAG AACGTCGTTG 6180 

CAGGTAAGAG GACAGCTACA GTATCACTAT TTCGCAACTT TTCTTTTCCT TTATCACTAG 6240 

AAGCTACTAA GTGGTCTGCT GATATTGCTT GTTCATCAAT TGCTAATTCC AGTCCGCCTA 6300

ACGGATCAAT TTCATCCGCA TGTATTTTCA CTTTAAAACC TGCTTCTTTG GCTTTTTGCA 63 60 

TATAATGTTG CGATTGTTCT ATTGTAAATA CACCTGTTTC ACAGAAAATA TCCGCAAAGT 64 20 

CTGCATATTG TTTTACTTCC GGAAGTAACG CAATCATTTC TTCTAAAAAT GCCTCATTTG 6480

AACTTGCCTC TTTAGGTACA GCATGAGGCC CTAGGAAAGT ATGTTTCATG TCTAAATCAT 6540

ATTTCTCAGC TAAACGATTA GACACTTTCA ATTGCTTCAG TTCATTTTCT CTATCTAATC 6 6 00 

CATAACCACT CTTACTTTCA ACTGCAAGCA CGCCGTGTTT AATCATAGTA AGCAAATCAT 6660

GCTCTGCTTT TTTAAACAAG TCATCTTCGG ATGTTTCTCT AGTAGCATTA ACGGTAGATA 6720

ATATGCCACC ACCCATTTCT AATATTTCAA GGTAAGACTT ACCTTGACGT TTTAATGACA 6780

TCTCATGTTC TCGAGATCCA CCAAATGTTA AATGGGTATG TGCATCTACT AATGCTGGGG 6840

ACACTACCTT CCCACTAGCA TCAATCGTCT CAGTCGCATC GTAGTCATCT GTATGTGTTC 6900

CAGCATATAC AATTTTGCCA TCTTTAATGA CAACTGTACC ATTTTTCACA ACATTTAATT 6960
- -. — a—— AAA^r^^AT C7GTTGATCT CGGTAAAATT AATTCTGCTA 7020 
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w 



AACACCCATA CCTGGGTCAG TCGTCAATAC ACGTTCCAAT CTTCTTTCAG CACGCTCTGA 72 0 0 

TCCATCTGCT ACAACAACCA TACCCGCATG AAGTGAATAT CCCATGCCAA CACCGCCACC 72 6 0 

GTGATGGAAT GAAATCCATG AACCACCTGC AGCTGTGTTA ATGAGTGCAT TCAATACAGC 73 2 0 

CCAATCACCA ACCGCGTCAC TACCATCTTT CATACTTTCT GTTTCACGGT TAGGACTAGC 73 8 0 

AACTGAACCA GCATCTAAAT GGTCTCGTCC AATAACAATT GGTGCTGAAA TTTCACCGTC 744 0 

ACGTACAAGA CGATTTAAAG CTAAGCCCAT TTTCGCTCTT TCTCCATAGC CTAACCAAGC 75 0 0 

AATACGTGAT GGTAGTCCTT GATATGAAAT TTTTTCTTCA GCTAAATCAA GCCATCTTAA 756 0 

TAACTTTTCA TTTTCTGGGA AAAGTTTGCG CATTTCTTCA TCCGCACGCT CGATATCTTT 7620

TGGATCACCA CTCAACGCAG CAAAGCGGAA TGGCCCTTTA CCTTCACAGA ATAATGGTCT 76 8 0 

AATGTAAGCT GGTACAAAGC CTGGGAAGTC AAAAGCATTT TTCACTCCGT TATTGAAGGC 774 0 

TACTTGACGA ATATTGTTAC CATAATCAAA TGCTACAGCG CCACGTTTTT GGAATTCAAG 780 0 

CATTAATTCA ACATGCTTTG CCATTGAAGC TTGTGACAGT TCAACATATT TTTTCGGATC 7 86 0 

TTTTTCACGC AATACTTTCG CTTCTTCTAC AGAGTATCCT TGTGGCACAT ATCCATTTAG 7920

CGGATCATGT GCACTTGTTT GGTCAGTAAT AATGTCAATT TTAAATCCTT TTTCTAGAAT 798 0 

CGCTTGATGG ATGTCTACAG CATTTCCAAC TAACCCGATT GATAATCCTT CTCCACGTTC 8 04 0 

TTTCGCCTCT TCTGCTAATT TTAATGCTTC ATCTAAATCA GCTGTTTTAA CATCACAGTA 810 0 

TTTCGTATCA ATTCGCTTAT CAACACGTGT TTCATCAACA TCCACGCAAA TTGCTACCCC 8160

ATGATTCATA GTAATTGCTA ACGGTTGCGC ACCACCCATA CCACCTAAAC CTGCTGTCAG 822 0 

TGTAACAGTG CCTGCTAAAT CTCCATTAAA GTGTTGATTA CCTAGCTCGG CAAATGTCTC 828 0 

ATAAGTACCT TGCACAATAC CTTGAGAACC AATATATATC CAACTACCGG CTGTCATCTG 834 0 

TCCATACATG ATTAAACCTT TTTTATCTAA TTCATTAAAA TGATCCCAGT TTGCCCATTC 84 0 0 

AGGCACTAAT ACTGAATTTG AAATTAATAC ACGTGGCGCT TCTTCATGTG TTTTAAATAC 84 6 0 

AGCAACTGGC TTTCCTGATT GTACTAACAT TGTCTCATCT GATTCTAATT CTCGTAACGT 852 0 

TTTCTCTATT GCTTCAAAAG CTTCCCAATT ACGTGCTGCT TTTCCAATAC CACCATAAAC 858 0 

AACTAAATCT TCTGGTCTTT CAGCAACTTC TGGGTCTAAA TTGTTGTATA ACATTCTAAG 8640

TACTGCTTCT TGTTCCCAAC CTTTACACTC AATACTCAAA CCTTTTTTTG CTTGAATTTT 8 70 0 

TCTCATAAAA TTCGCTCCTG TTCTTTTAAG AAGTTAATTC CACTAAATTT AAAACGCTTA 8760

CATTATTATC TTCAATATTC ATTATAGTAT GTTAAAATAT AGCCAACAAA TATAAATAAA 8 82 0 

CTAATTATCC ATAGCTTGAA TCTATAAATA AAAGGAGCAA AACACATGAA AATTATTCAG 8 88 0 
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CATATTAGCC AGCCATCTTT AACTGCTACG ATTAAAAAAA TGGAAGCAGA TTTAGGTTAT 90 0 0 

GACTTATTTA CACGTTCAAC AAAAGACATC AAGATTACCG AAAAAGGAAT ACAGTTTTAT 9060

CGTTATGCGA GCGAATTAGT TCAACAATAT CGATCCACGA TGGAAAAAAT GTATGATTTA 9120

AGCGTTACA7 CAGAACCAAG GATAAAAATT GGGACTCTTG AATCTACGAA TCAATGGATT 9180 

GCGAATTTAA TTCGAAAGCA CCATTCCGAC TACCCTGAAC AGCAATATCG TTTATATGAA 9240

ATACATGATA AACATCAATC TATAGAGCAA TTACTGAATT TTAATATTCA TTTAGCTATA 93 0 0 

ACAAATGAAA AAATAACCCA CGAAGATATA AGATCCATTC CTTTATATGA GGAATCTTAC 9360

ATTTTATTAG CACCCAAGGA AACATTTAAA AATCAAAATT GGGTAGATGT TGAAAATTTG 9420

CCACTCATAT TACCAAACAA AAATTCTCAA GTGCOCAAAC ACTTAGATGA CTATTTTAAT 94 8 0 

AGAAGAAATA TTCGTCCAAA TGTCGTTGTA GAAACAGATC GATTCGAATC AGCAGTTGGA 954 0 

TTTGTTCATC TCGGCTTAGG TTACGCTATC ATTCCGACAT TTTATTACCA ATCATTTCAC 96 00 

ACGTCTAATT TAGAATATAA AAAAATTCGT CCAAACTTAG GCCGAAAAAT TTATATCAAT 966 0 

TACCATAAAA AACGCAAACA CTCCGAACAA GTACATACAT TCGTACAACA ATGCCAAGAT 972 0 

TATTTATATG GACTTTTAGA GGCTCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 978 0 

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 984 0 

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9900

CTCAGTCAAC TGTATACCTT TTTCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9960

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGTGCCTCT TATGTAGTTG 10020 

CGTAGTCAaC TGTaTACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 10080

CGCAGATCAT CGTATAAAAA TTAATGACGT CATTTCAAAA ATCGATACAA AAATAATTTA 1014 0 

TTATAAAAAT TCTAAGAAAG AAGTGAAGCA GATGTTAAAA TCTATTAATC ATATATGCTT 10200 

TTCAGTCAGA AATTTAAACG ATTCAATACA TTTTTATAGA GATATTTTAC TTGGGAAATT 10260 

GCTATTGACT GGTAAAAAAA CTGCTTATTT TGAGCTTGCA GGCCTATGGA TTGCTTTAAA 10320 

TGAAGAAAAA GATATACCAC GTAATGAAAT TCACTTTTCA TATACACATA TAGCTTTCAC 10380

TATAGATGAC AGCGAATTTA AATATTGGCA TCAGAGGTTA AAAGATAATA ACGTGAATAT 10440 

TTTAGAAGGA AGAGTTAGAG ATATTAGAGA TAGACAATCA ATTTACTTTA CCGACCCTGA 10500

TGGTCATAAG CTAGAATTAC ATACTGGCAC ACTTGAGAAC AGATTAAATT ATTATAAAGA 10560
— — ~ — a-aa^a A^H^nT^ATT ATAAAAArrHP CTCTTGAACT 10620 
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TTACTGCAAT TATTTTTCAA ATATATCAAC GTTAATATAA CTTCTATTAA GAAATACTCA 10 80 0 

CATTCTGCCC TGCAATGCAA ATCTCGTCAC ATATAAATAT TTTTAATTAT TTTAAAAAAT 10 860 

GATGCACTAA ATTAGCAACG AGCTTAGCAG TTCTATTGTC AGCGTCATAT GTTGGATTCA 10 920 

TCTCAGCAAT ACTAACTGAA GACACCTTAT CACTTGGAAT AATACGTTTT GCTAATTCAA 1098 0 

GAACAGTATG TGGATACAAA CCTAACACTG CCGGCGCACT TACCCCAGGC GCAAACGCAC 11040 

TATCAATGAC ATCCATACAA ATCGTAAACA TAATGACATC ATGTTCATGT ACAAAACGTT 1110 0 

CAATCATATC TTTAATTGTT GGTGATACGT GACTCAATAA TTCATCTGCA AAGACATAAT 1116 0 

CAATCTTTTT CTCTTTAGCA TAATCAAATA AACTTTGCGT ATTACCACCT TGAGCAATAC 11220

CAAGCACTAA ATAATCTGTG TTTTCATCTT CTTCTAAAAT TTGTCTAAAG CTCGTTCCAG 1128 0 

ATGTAGATTG TTGTTCAGCA CGTGTATCAA AATGCGCATC AATATTTATC ACACCAATAG 1134 0 

ATTGTGTTGG ATAGACTTTA CGTGTTGCTA AATATTGAGC ATACGCAATA TCATGTCCAC 114 00 

CACCTAATAA AAATGTTTGT CTATGATTAG CAATTGACTT CGCTGCAAGC ATAGCAAATT 114 6 0 

CTTTTTGAGT ATCAATTAAT TCCTCATGAT CATGATAAAC ATTTCCGTAA TCGACTAAAG 11520 

TTcACATTGA TTCAAATCCG GCAAACCTGC AAATGCTTGT TTAATCGCAT CTGGTCCTTC 11580 

TTTTGCACCA ATGCGCCCCT TGTTTAAAGC AACACCTTTG TCAACAGCAT AGCCTAATAT 11640 

ACCGACCCCT GATGGCATAC TACTCTTTTC CAGCTTAGAC AAATCTTCAA ATGTTACTGT 11700

TTGAAAATGT CTAAATTTTT TCGGGTCTGT TTCACTATCT AACCTTCCAG TCCATAAATT 11760

TGGTTCACCT TGCTTGTACA CAGCATTTCC CCCTCTTATT TATGTGGCTT ATTAACAATT 11820 

AAAGTATAAC GTATAGGAAA TTTTGAATTC AATTCATAGT TAAATCCGTA TCTTAAAAAT 11880

ACTTATCTAC ATTACTTTTA CCCCTATTTT CTATGTAATA ACGAATACTT AGCTGATTTA 11940 

TGTTAATAAA ATACGTCAAG ACTATTACAT TTTCATTAAT ATTGACATAG ACAATTTATC 12 00 0 

TCTCGGCTTG TAATATGTAT AATTGTTACT AAAAGATATT TTGCTTGTTA CCTAATGGAG 120 6 0 

GTTACATATA ATGAAGAACA ATAAAATTTC TGGTTTTCAA TGGGCAATGA CGATTTTCGT 1212 0 

CTTCTTTGTC ATTACAATGG CGTTATCCAT TATGCTCAGA GATTTCCAGT CTATAATTGG 12180

TGTCAAACAC TTTATATTTG AAGTTACAGA TCTAGCACCA TTAATTGCTG CAATCATTTG 12240

TATACTCGTT TTCAAATATA AAAAGGTCCA ACTTGCAGGT TTAAAATTCT CAATCAGCCT 12300 

GAAAGTAATT GAACGTCTAT TGCTAGCTTT AATTTTACCT TTAATTATTC TAATTATTGG 12360

TATGTACAGC TTTAATACAT TTGCAGATAG CTTTATTTTA TTACAATCAA CAGGCTTATC 12420

AGTACCTATT ACACACATTC TGATTGGACA TATTCTGATG GCGTTCGTAG TAGAATTCGG 12480 
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5 



TGTTGTTGGT TTGATGTATT CAGTTTTCTC AGCAAATACA ACTTATGGTA CAGAATTTGC 12600

TGCTTATAAC TTCCTTTATA CATTCTCATT CTCTATGATT CTTGGTGAAT TAATTAGAGC 12660

GACTAAAGGA CGTACAATTT ATATTGCAAC GACATTCCAT GCTTCAATGA CATTCGGACT 12720

TATTTTCTTG TTTAGCGAAG AAATCGGCGA TCTATTTTCA ATCAAAGTCA TCGCCATTTC 12780

AACAGCAATC GTTGCAGTAG GATACATTGG TTTAAGCTTA ATTATCCGAG GTATTGCATA 12840

TTTAACAACA AGACGAAACC TTGAAGAACT TGAGCCTAAT AATTATTTAG ACCATGTCAA 12900

TGACGATGAA GAAACTAATC ATACTGAGGC TGAAAAATCT TCTTCAAATA TTAAAGATGC 12960

TGAAAAAACA GGTGTAGCTA CTGCATCAAC GGTTGGTGTT GCTAAAAATG ATACTGAAAA 13020

TACAGTGGCT GACGAACCAA GCATTCATGA AGGTACTGAA AAAACAGAAC CTCAACATCA 13080

CATAGGTAAT CAAACTGAAT CTAATCATGA TGAAGATCAt GACATCACTT CGGAGTCAGT 13140

AGAATCAGCm GaATCAGTTA AACAAGCACC ACmAAGTGAC qATTTaACAA ACGATTCAAA 13200

TGAAGATGAA ATAGAGCAAT CATTAnAAGA ACCTGCGACT TATAAAGAAG ACAGACGTnC 13260

ATCAGTTGTA ATTGATGCAG AAAAACATAT CGAAAAAGCT GAAGAnCAAT CTTCAGATAA 13320

A 13321



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8549 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATGTGTTGTA AACTTTTATG TTGAAAAAGC TACTTATCTC AATGAAAACA AGTAGCATTT 60

AATAAATTAA TTAGTATACA GCTAGTTTTT CTAATTGTTC TTTAACTTGA ATTAAGTTTG 120

ACCGTATTAG AGAGGCAGAT TGATCCATCG TTTGAATTGC TTGTCCTTCA TTTTCGTTCA 180

AGCCATTACA AACAACTTCA AACTGTTGTG CCATTTGATC AAGACGCGCA TGAGCTTGTG 240

TGTTTAAAAT AAACATATCG TCATAATGTG ATGGCGAATA GATAATTCGT CGTTGTATAC 300

AAACGTATAA AAACCTTGTC ATATCAACGG TTTTGGCATT TTTAAACCTC TGTGTTTTCC 360

ACGCATGTTT GCCCTTATTT AAATAATTTG CCCTTTTTTC GCCCCGAAAA AAAAACACAA 420
- — T pr< AT~AATAGGT GGTGTGGTTT TGTTGATTGT AGGGGTATAA 4 80 
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10 



AAACAGGACT CCACATAAAA ATCAACTCCT TTATATACCA TAATGATACT ATATTTTCTA 6 60 

GTTTATTTCA ATTTTTCAGT TTTTAAAAAT GAGTTTCTGT TTTTATTTAT ACGCTTTTCT 720 

GTTTTCTTTT TAAATTTTAT CTTTTTGTTA TTCCATTCAT TGTAAAATTC TATTAAATTA 780 

ACATAAAATT TTTCATGCCC TATTTTATTT GTTGATGAGA TATCAATGTA AAGACTCAAT 84 0 

ATTGTTTTTA AATAGATTTG ATGCAACGAC TGATAAACCG TATTACTATC TGCTATGTTA 9 00 

TTGGTAAAAT GCATAGAAAA ATATTCTAAT TTATTCATGC AATATATATG GGTTTCATTA 96 0 

TACTTCTTAA TGAGTGTATT TATACCTTGC AATACGTCAT TACTTTTAAT AACAATTTCT 1020 

TTTTCACCTG TCGAAAAAGT CCACTGTTTA TCTCCTATAT TTTCTTTAAT TGTTTTCTTG 1080

TTGTCAAATT CTAAAATTAT AGCCCGTAAA CACTCTTCTT TATAATTCTC GTTCTTGAAA 1140

GTACGAAGCA AAATTTTTAT AAATTCGGTA TTGGTGACTT TTTTATAAGT GTGATATTTT 120 0 

GCAATCTCTT TATCAGTAAA GACTGTTCTT AGTTCGTGAT TATCAAAACT TAAATTCATC 1260

TTATTCTCTA ATTCATTAAT TTTATCTTGC AAACCAACAT TTTCTAAAAT TTTCTTGTTT 132 0 

ATCTCCCCTA TATCAAAACT CCTTTTCGAA ATTAATTTTG AAAACTCGTC TGCCATTTCA 13 8 0 

ACAGCCTTTT CTTTCCTTTT ATACCTTTTG TTAAATTTAT GAACCACCGT TGCAGCATAA 144 0 

TACGATATCC CACCAGATAA AATAGATGaT ATTATCGGTA TGTATATATC ACCTTTCATA 1500

TTTCCACCTC TTTTAACACA ATTAAGTATT ATGATACACA ACTTGCGCAA AAAGATGTAG 1560

ACAGAACATA ATGGCGAACA AAAACAACCA CCCAGTAACT AGTATGGGTG GCGTAgACTA 162 0 

TAACAACTCT ATGTTATCAA GATATATGTA TCGAGTGATG GCAAGGAAGA AGTCTCCTGC 16 8 0 

GGGACCAACA GTCAGATATA TGGCCTCTGC CGGGCTATAT AGTTCACTCC TACTATATAA 1740

AAGTAAGTAT AACATAAAAA GCACCCCGTA AACTGTTATA CGGGAATGCT AAAGTCATAT 18 00 

ATACTACGGG GAGTAGTATG AAAACTATGC TCTCTATCGT AAGAAAAAAC ACCCAGTGAC 186 0 

ATGCTTGGGT GAACAAGGAT AGATGTAAAT AGTTGATGCA TGTGTAcACA TCATAACAAA 1920 

AAACTAGCCC GAAGcTAGCT ATAACATAAA AAAATAGGCA AGTACCGAAG TACCTGCCAG 1980

TTACGCACAT TTAAATCTTG AGAGTAATGT TAAAAAGTGT ATAGGAATAT TAACATCCAT 2 04 0 

CCAAATAGTT ATTTAATAAC TGTAAGATTC CCTATAATTA ATGTAGCaAA ATTTTTATTC 2100 

TAAGTAAATA CTAAATCGTG CTAAACTTAC CAAAACTACT TATTCTATTA CCTGCCTTGT 216 0 

CTACCTCTCC TGTCGCTATA TAACGACGTT GTCCACTATT AGCAATATAA GTAATCCATC 2 220 

TATAGCCATT GATGCAATAT GCGCCGTCAT ATTTAATTGT TGCGTTATTA GGTAATACAC 2 280 

CTGTAATTCT TGAATTAGTT GAATAGCCGT CCCTTACGTT ATTACCTTTA ACATTGGCAA 2 34 0 
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CTGGCACTGG TGGATTTTTT TGGTTTTTAG CTGATGTTTT AACATTACCA GCTACCAAAC 24 6 0 

CACCTATAGG CTTACCATGA ATCGCACCGG CTATTAATTT AGAATACAAG TCATAGTTTT 2 52 0 

TCTTAATCCA ATCCATATCA TTTTTATTAG TAATAAAACC TAATTCAGAT AAACGATAGT 2 58 0 

TTATATTTAT TTCTGCTGAT ACATTAACGT TTAGTAAATC ATTACGAGGT GTTACACCTC 2 64 0 

TTATTTGTCC TAAGTTATTT TTAATAACAT CTTGTATACT TTTATCAATA GTATCTGCAT 2 70 0 

TGAATTGACT TGAAATAATA ACATGCCCAC CACTTGCACT TTCTCCTGCT GCGTCTAAAT 2 760 

GAATCTCTAG AACAATGTCA TACCCATGTG ATTTAACCCA ATATAAGCCA TAAT CTTT AT 2 82 0 

75 TATTTCCTAC ATTAACACCG TAAGCAGTAT CTTGATACAT ATCTTGTGAT TGACTTGAGC 2 880 

CACCATATAA TGCAACTTCG TGACCTGCAT GTCTTAAATA CTTAGCGATA TTTGGTGTTA 2940 

TATATTTACG GATAAAATCA CGTTCATTTG TTCCGTTTCC GACTGCTCCA GGATCGTTAT 3 000 

20 AACCATGACC CGCTACAAGC ATAATTTTTT TAGGTTTAAT TACTGCTTGC TTTTTGGCAG 3 060 

TTGCTTGCTT AATAACGCTT TT AG CTTT AT CTCCAACACT TACTTTATCT GGG AAATTTA 3120 

ATCTAATAAA ATACATTGGG TCATCGTAAT AATGAACATG TCTTGTAACG GTTTCGGGAC 3180 

CCCAACCAGG TTGCGCAACG CCATTTGTCC AACCTTTACC ATTCCAATTT TGGCCAAACG 3 240 

ATGTGAAAGT GTTTAGATTA GCGCTCTCAA CAATTTCAAC ATGTCCaGct CCGCCACCAT 3300 

30 ACTTTGACGG GAAAACGACA ATGTCCAACT TTTGCGGTAA AAAGCTATCA TAGTTTTTAA 3 360 

TTATTTGCCC GTATTTTTCA ATCCTTGCTT TATTATCAAA TGGAATATTA TAAGCGTATA 342 0 

AACCTTGTAA CcTTTCGCCT GTTGCTATCA TAAAAAACAT ATTTGCGTAA TCGTAACACT 34 80 

35 GAAATCCATA AAACAAATCA GGATTGAACT GCTTCCCTAA TGAATTATCA AACCATTTTT 3 54 0 

CTGCTTGGTT TTTTGTTATC AACATTGGTC AACACCTACC CTAAATCATT TGTGTCGTTC 3 6 00 

ATATTCGTAG GTGTCATTAC TTCTTTAATT GGCGCTTGCC CTGTTGCTTT TCTATACTTG 3 660 

TTTTCAGCTT TATATTTCTT TAGCTTTTGA TTTGCCCATT TACCTTCTTG AGATGTTGGA 3 720 

TT AT CTTT AT ATGTAGTATA TAAAGCAACA ACTGTTAAGA TAATCGATGA AACACTTTCT 3 780 

TCATCTACTG GTATCGGACT TATACCTTTA TTCGCTAAAA ACTGATTGAC TAATGCTAAG 3 84 0 

ATCAATACGA TGTATCTTGT TATTACTTTT GCATCCATTT GTTTGCTCCT TTTATCCAAA 3 900 

ATAAAAAGCC AGTGCCGAAG CACTGACTCT TAACTATTAC TTACACTTAC TAAACCAGAA 3 960 

ACACGACCAA AAGCTATATC CTAAAATTCC CTTAAGCATG GTAATCACCT CCTTTAAATG 4020

^^TTAA^AA GGCTATAACA AATGTACTTA GAATCGTCCC TATTAATCCT 4080 
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TGCGTTCTCA GACTGTCTTC TATTCTGTCG AATTTTTCAA ACATAGTCTT ATCATTTTCT 426 0 

TCTAATCGCG TTAAACGCCA ATCTTGTTCG TGTCGTTTGG TAAATCCAAA CATTACACCA 4320 

CCCACTTTAT TCAAATTAAA AAGCCATAAG ATTATAACCT ATGACTCTAG ATTTTCTGGA 43 8 0 

TACTTTTCTC CTGTAATAAT TGCATATTCC TCTTTATCTA TAACTTCCAT AT CTACATAC 444 0 

CACGCTATAT CTTCTTTACT ATATTCTTTC AATTGATACC ATGTTTTAAT ATCTTCGAAT 4 5 00 

GTTGGTGAAA TTAATTTAAG CATTTTCAGT CTCTCCTTTA ACCTCTTCTA ATTTTTTATT 456 0 

AAGTGTCACA AGTTGTTTTG CCATTAGTGC ATTTTGCTTA TTAACTTGCA TCGATAACTT 46 2 0 

TGTACTTTGA ACAACTTGTT TCTGCATACT AGCAACCATT TTTCGTAAGA TGTCATCAGA 4680

AGCGACTGTG TTTTGTTCTT CACTGTCAAT CTGTTGATGC AAGTCATCTT TTTCTTCTGA 4 74 0 

ATAATCTTCG TTAAAAACTA TTTCCCCATT TGAATATTTA AAGGCTTTAG GTCTAAAAAC 4 8 00 

TTGAGAGAAA TTTTCTGGTA AATTTTCAAT ATCAATACCT TCTTCAAAGC CACCAATGAT 4 8 60 

AG CGTATGAA ATTATCTCAT TACGCTTGTT AACTAATATT TGCATTATTT TCTCACTCCT 4 9 20 

ATAATTTTGT TAATTGTCCC TCTATTTGCG TTCGCACCAG AGCCTCTTTG ACTTCCTAAG 4 9 80 

TCGAAATAGA CATCGTTTGA TATAGTTAAA GATGTACGAC TAGATTTAGT TAATCCAAAC 504 0 

TCATAAACAC CTCCACCATT TCCATCACCA TCTGGAAGAT TTGAGGGATT CAATGAAATC 5100 

30 TTTCCTCCTC CAAAAGGACT GCCAAACTCT GTAAAGTCAC CACCTGGAAA AGTCCCATAA 5160 

AAAATTAATA AAATAAATTG GTCTAAACTC TCATTTAAGT ACAATGTAGA GCCCACACCA 52 2 0 

TTTGCTGTTC CATCAAAAAT AACCGAATAC CTTTTATTAA ACTTGTCATC TGCGTATAAT 52 8 0 

35 TTAGCGTTAC TTTCGGCCAT ATTAGCTTTT GATTGGGCAC TTTGAACAGT TTCAAAAGGT 534 0 

GTATTGTAAT CATTAATAGC TAATTCTGAC CACTCAGACC ATGAACCCGC TTCTTTTCTT 54 0 0 

TTAACAAATA CTTTATTTGT ACCGTTCGGT CGATAAGTCA TACGCTTGTA ATCTGAAGTT 54 6 0 

ACTACTAAAT ATTCGACAGT ACCGTTAGTA CTAACACCTC TTGGATAATT TATAGCTTGC 55 2 0 

GAAACATAAA TAAATTGGGT TGAATCACCT ATTCTTTGTT CTGGATTATT AAAATCAAAT 5580

45 CCAGTAATCT GCATTATCTT ACCATCATCT TTAGTAATCT TAGCTTTTTG CCAATTTGAA 564 0 

GTAGAACCAC TTGTGACTAA ACCACCACTA TTCACTGACT GCTTGAAGGC TTCATGTTTC 57 0 0 

TCATCCATAT ATCGCTTTTG CTCATCGAAT GTTCTTGAAT ATGCTTGCGC TTTATTTTCC 576 0 

50 AAATCAGATA TATGGCTATT AGCAAGTTGC TTTAATTCAT CTATACTTGA AGATTTTGCT 58 2 0 

ATTTGAATAT CTGATAGACC TTTTTCTTTA GCTTTTTCAA TCAGACTCGC ATAATCTTCA 58 80 

C CATTTTTT A TAGCCTCGTC CATTG CTTTC GCACGATCCA TAATAGTTTT TTCTAATTCC 5 94 0 
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TCAACGTTAA ATGTGATAGT TCTCTCGACA ACTACCACGT CTGAATTACC TAATTCTGCA 6060 

ACCGAAACTT GAGCTTGATA ACTTCCATCT CGTTTAATTA CATCATTAGG TAATTGAAAT 6120 

TTTAAAATAC CTTTAAATGG ATCTAATATT TCTAGTGGAG CAACTACCAT GACTCCTTTA 6180 

CCTCGAATCG CTATTCGTGC kTTGATATTT tCTTCACTCA ATAATAACGG TTGATTATTT 6 24 0 

TTAGTGATAT TAAAAAGAAG AACAGAAGAA TCACTCTCTC CTGTTCTAAA AGTTATATCT 6 3 00 

AGATTTGAAA TATTTCCATA ATGCGCTGTG TTTTCTAAAT TTATAGCTAC AGATTTCTCT 63 60 

AAATTACTCA TTAACTTATA ATTCTCCCTT CGTGTAAAGT CCATGGCCCT GAACTTGTTT 6420 

TACTATCATA ATTTTTCAAT AGTATCTCAG CAGATGCTGT AACACTATTA CGAACTAGCC 6480

TATGAACAAA GCCACCTGTG TTTGAAGCTT CTACATATAA GTTCCAACCA GCTACCCCTT 654 0 

TACGTTCAGT TGGAAAATCT GTAAAACGTT TTGTATCATC CGTAGTTAAA TAAAACGACA 6600 

TGCCTACTAT GTTAATATCT GACATTTTTG TGATGAATGA AGGTACTCTC TCCCATTTAC 66 60 

CACTATTTTT AGGCACATAA TTCCAGTCCG AAATGTCTCC AGTTCTTCCA GAAAGCACCC 6720 

TTTCAAAAGT CATCATATTC CTTGCATAAC TATTACGCGT CAATATCTGA ATTACATCAC 67 8 0 

CGCCAGTTTG TGGTGGCTTA ACTTCCAAGA ACCAACCTGC ATCACGCCAT TCTCTTGGTA 6 84 0 

ATGGGAAATC ATCGATTTGA ACTGTATGAT CAGTGTATAA ATAGTAAAGA CCTGGCTCTG 6900 

30 TTAACATCCC AAGATTCTTA AGTTTATCAG GCCTCATTGG TAAAGGTTTA ACTCTACCAC 6 96 0 

CTGTGTCACT CaTGATAAAA GGAACGCCTC TTGAGTGAAG TATTTCTAAA ATACCTCTTT 702 0 

GCCCAATCAT GAAAATACGA TGTGTTCTAT TTCCaTCACC ACCGACAGTA ACACCTAGCA 708 0 

35 TCAAAGCTTT TTTACCACTA TCTTTGTCAT AGTATATTTG CAAACCTTtC TgCTTCCGCA 714 0 

AATTCGCCAG GAAATGAATC tAgTGTTCCA CCATAGTCAG CATTAACCTG ATACGCTTCT 72 00 

TCTCCTGTTT CTAAATCGAA AG C C GTT AAA TAGTTTCTAT TATTTGGATT ACTGTCTCCT 72 6 0 

GTATACCAAT ACAAGTATTT TTCATCAAAA GTCACACCCT GCATTGGTTG GGTTTCGTTT 732 0 

GTTAGTCTCA TAGGGATACT GATTTTATGC AAAACTTTAT CAATATTTTT ATCAACATCG 7 3 80 

TCTAAACTTC TTATCTCTAT ATAAnTCATT GAGTTTTCAA GTTCCCACTG ACTTCTAGGT 74 4 0 

CTCTCaATTC TGTATAGAAT TTTATTTTCT TTTTCATTTA TGACAGGGGT GATGTAGGGT 7500 

TTTTCTGGGT GTCCTGTAAA TACATCTTGC ATACCATACT TGCCATAGCT AATTTCCACA 7 56 0 

50 TTAGGCGTAT ACTTGAAACG AACTAATGTA TTCTCATTAT TACCATTTAA GATAAAACTA 762 0 

-^a-^ATA ft^TrATcATC AATATATCTA TAACCGTTAT GTGTACCATG ACCCCCACCT 76 80 
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ATTACTGCAT TTGTAAgAGG TGCAAGTTCT GTCACAAATA AAAATTCTTG CTTATCAGGT 7 86 0 

TCAAAACGAT ACTCGATATC AAGAATTTCT TGTTTGGTCT TATTTAATTC TCTTATAGTT 792 0 

TCCTCTTTAT TAATTTGAGT TTTGGTTTCC CAATCGTCTA AATGTTCTTT TAATGTGTCA 7 9 80 

AAGGTTTCGC CGTTTACATT AACTCGAGCT TGAACAATCT CATTAGCACT GTTATTACGT 8 04 0 

GGTGCCACAA CAAGTGCGTT AATTTGACTT TGTAAAGATT TGTTTACTGC TGCTTGCGAT 8100 

CTACCATTAT AATAAATTTG CTCAGCGAAG TGTTGAATTG TTTTAGCTyT CTGATGCAAC 816 0 

TTAAACTCTG TTGTCAAGCC AAGCGCAAAT TGCTCTATTC TTTGTAAGTT TTGTATTTCC 822 0 

15 TT AG CT CT AT AATCTCGACC TGCTAAAGCT CCCAAATCCT TTATTAAATA CAAATTTTCC 82 8 0 

ATAATGCACC TTCCTTTCTA ATAAAATAGC ACTGTACCAA GTTTCCCACT ATCGTCAACT 834 0 

GTTATTTTCC ACAATTTACC GTTTGGGGAT TTCTGTACAA TGCTATTTTG AATAATTgcC 84 0 0 

TGctTCGCCT ATTTTTAAAT TATCTAATTT ATTTkTATCA TTTAC CG AAA TGATACCGTC 84 6 0 

TTGAGGCAAT CCATCAATAn CACTACTGCC TGCATAAGGT ATCCCATTTA TAG CTTTC CA 8 52 0 

ATGTGTAGCT GGAAAGTACT GTTTATCGT 854 9 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3601 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

AGGCGTGTAG TGACTTACGG nTAGGAAACT ATGTATCCGA ATGATTTATT GAGAC CAAAA 6 0 

AGGCATTAAA GTCCATTGAA ATATCnGGTA GCGmGTTGGT ACgTGGACGT GGGGGCCCTA 120 

GATGTATGAG TCAACCATTA TTCAGAGAGG ACATTTAACG TAATAAATTA TAGAmACGAG 180 

GGTGAAAATA ATGACAGAAA TTCAAAAACC GTATGATTTA AAAGGCAGAT CATTATTAAA 24 0 

AGAAAGTGAT TTTACCAAAG CAGAATTCGA AGGACTTATT GATTTTGCAA TTACATTAAA 3 00 

AGAGTATAAG AAAAACGGTA TTAAGCATCA CTACTTATCT GGAAAAAATA TTGCACTACT 3 60 

ATTCGAAAAG AATT CGACGA GAACGCGTGC TGCGTTTACA GTTGCGTCTA TTGATTTAGG 42 0 

TGCGCATCCA GAATTTTTAG GAAAAAATGA TATTCAATTA GGCAAAAAAG AATCTGTAGA 480 

GGATACTGCG AAA 3 T ATT AG GTAGAATGTT CGATGGTATT GAATTCCGTG GTTTTTCACA 54 0 

ACAAGCTGTT GAAGATTTAG CGAAGTTCTC TGGTGTAC CG GTGTGGAATG GATTAACAGA 60 0 
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TCTAGAAGGA ATAAACTTAA CTTACGTTGG AGATGGACGT AATAATATTG CGCATTCATT 720 

AATGGTAGCA GGTGCTATGT TAGGTGTTAA TGTAAGAATT TGTACACCTA AATCATTAAA 780 

TCCAAAAGAG GCATATGTTG ATATTGcAAA rGAAAAaGCG AGTCAaTATG GTGGTyCAGT 84 0 

CATGATTACG GATAATATTG CAGArcCAGT TGAAAaTwCm GATGCTATAT ATmCAGATGT 900 

TTGGGT AT CG ATGGGTGAAG AAAGTGAATT TGAACAcGTA TTAATTTATT AAAAGACTAT 96 0 

CAAGTGAATC AACAGATGTT TGATTTAACA GGTAAAGATT CAACGATATT CTTACATTGT 102 0 

TTACCAGCAT TCCATGATAC AAATACACTT TATGGACAAG AAATTTATGA AAAATATGGA 1080

15 TTAGCTGAAA TGGAAGTTAC AGACCAAATC TTTAGAAGTG AACATTCAAA AGTGTTTGAT 114 0 

CAAGCTGAAA ATAGAATGCA TACAATTAAG GCAGTAATGG CAGCAACATT GGGGAGTTAA 120 0 

TCACTAAATG GAACGATATG AATATGATGT GTCTGATGAT ATAAGTGTCA TGTACAGACA 1260 

20 CCTCATATTG G T ATT AAAGG AHAAATGAAT ATGAACGAAT CAGGAGATAA CAAACTCAGT 1320 

AAATCTTCTT TAATTGGACT AGTTATAGGA TCCATGATTG GTGGCGGTGC GTTCAATATA 13 8 0 

ATGTCTGATA TGGGCGGTAA AGCCGGTGGA TTAGCCATTA TTATTGGTTG GATTATTACA 14 4 0 

GCTATAGGAA TGATTTCATT AGCGTTCGTA TTTCAAAATT TAACCAATGA ACGGCCGGAG 15 00 

CTAGACGGTG GTATTTATAG TTATGmTCAA GCAGGATTTG GCGATTTTGT AGGATTTATC 15 6 0 

30 AGTGmTTGGG GATATTGGTT CTCAGCGTTT TTAGGCAATG TTGCCTATGC AACACTATTG 16 20 

ATGTCAGCAG TAGGTAACTT TTTCCCGATT TTTAAAGGAG GCAACACATT ACCAAGTGTT 16 8 0 

ATTGTCGCCT CGTTACTACT CTGGGGTGTC CATTTCTTGA TTTTAAAAGG CGTTGAAACA 174 0 

35 GCAGCATTTA TCAATAGTAT TGTTACTGTT GCAAAGTTAA TACCGATTTT ACTTGTAATC 1800 

ATATGCATGA TAATTGCATT CAATTTTGAC ACTTTTAAAA CAGGCTTTTT CAGTATGACG 1860 

TCAGAGGGTG TATTGCCATT TAGTTGGGCG AGCACAATGA GCCaaGTtAA AAGTACGrTG 1920 

CTAGTGACAG TTTGGGTGTT TATCGGTATC GAAGGTGCAG TAATTTTTTC TAGTAGAGCT 19 8 0 

nAAAATGAGA AAGATGTAGG TAGTGCCACG GTTATAGGAC TTATATCAGT TTTAATTATC 204 0 

TATyTCTTAT TAACTGTATT AGCTCAAGGC GTGATTTTGC AAAAT CAT AT TTCGCAATTA 2100 

GATTCGCCAA GTATGGCACA GGTGCTTGCA ACTATTGTAG GTGGTTGGGG ATCTACACTT 2160 

GTAAATATTG GTTTAATTAT TTCGGTACTA GGTGCATGGT TAGGATGGAC ACTGCTTGCT 222 0 

50 GGTGAATTAC CTTTCATTGT TGCAAAAGAT GGATTATTTC CAAAATGGTT TGCTAAAGAA 22 80 

■.■--r^".^ rzr.nrfrrrTGT AAATGCACTG CTTATTACCA ATATATTAGT ACAATTATTT 2340 
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CGACAGCAAG CAACTACTAA ACAATGGACG ATTGGTATCA TAGCCTCAAT TTATGCTATA 2 52 0 

TGGCTTATAT ATGCAGCAGG TAT CAATT AC TTATTATTGA CGATGTTACT TTATATTCCA 25 8 0 

GCTCTTCTTG TTTATACaAT CGkTCmAAAG rATwATCAGa CACGTTTGAT TAAATCAGrC 264 0 

TATATTCtTT TTATGATTAT tATCGTACTT GCAGTTATCG GGTTAATTAA GTTATTGATG 270 0 

10 GGAACGATAA ATGTTTTTTA AAAGGAGCGA CAAAAATATG AAAGAGAAAA TTGTCATTGC 276 0 

ATTAGGCGGT AATGCGATAC AGACAACAGA AGCAACAGCT GAAGCACAAC AAACAGCTAT 282 0 

TAGATGTGCG ATGCAAAACC TTAAACCTTT ATTTGATTCA CCAGCGCGTA TTGTCATTTC 288 0 

15 ACATGGTAAT GGTCCACAAA TTGGAAGTTT ATTAATCCAA CAAGCTAAAT CGAACAGTGA 294 0 

CACAACGCCG GCAATGCCAT TGGATACTTG TGGTGCAATG TCACAGGGTA TGATAGGCTA 3 000 

TTGGTTGGAA ACTGAAATCA ATCGCATTTT AACTGAAATG AATAGTGATA GAACTGTAGG 3 060 

CACAATCGTT ACACGTGTGG AAGTAGATAA AGATGATCCA CGATTTGATa ACCCAACTAA 3120 

AcCAaTTGGT CCTTCTTATA CGAAAGAAGA AGTTGAAGAA TTACAAAAAG AACAGCCAGA 3180 

25 CTCAGTCTTT aAAGAAGATG CAGGACGTGG TTATAGAAAA GTAGTTGcGT CACCACTACC 3 24 0 

TCaATCTATA CTAGAACACC AGTTAATTCG AACTTTAGCA GACGGTAAAA ATATTGTCAT 33 00 

TGCATGCGGT GGTGGCGGTA TTCCAGTTAT AAAAAAAGAA AATACCTATG AAGGTGTTGA 3 3 60 

30 AGCGGTTATA GATAAAGATT TTGCTAGTGA GAAATTAGCA ACGCTGATTG AAGCAGATAC 34 2 0 

CTTAATGATT CTTACGAATG TAGAAAATGT ATTTATTAAC TTTAATGAAC CTAATCAACA 34 8 0 

ACAAATCGAT GATATTGATG TAGCAACACT GAAAAAAtAC GCGGCACAAG GTAAGTTTGT 3 54 0 

GGAAGGATCG tGTTGCCAAA AATAGAAGCT GCGtACgtTT GTTGAaAGtG GGGaAACCAA 3 6 00 

A 3601 
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

CGACACTATT AAATGAATTA GAGCACAATC TAACAAATCA AATTCATTTT TCAAAAGATG 6 0 

AACGACTCAC ACATATCGCT TTAAAGTTAT TCGAAACAAC CGATCCTGTT TCAACAAAGC 12 0 

AACTTGCGCA AGATGTTAAT GTTTCGCGTC GGACAATTGC AGATGATATT AAAATGATTC 18 0 
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TTATTGGTGA GG AAGATCAT TATCGTAAAG CGTATGCACA CTTTATACAT CAATATATGA 3 00 

AACAAGCTGC ACCTTTTATA GAGGCGGATA TCTTTAATTC AGAATCAATC GCATTGGTTC 3 60 

5 GCCGTGCCAT TATTAAGACA TTAAATAGTG AAAATTATCA TTTAGTTCAG TCGGCTATCG 4 20 

ATGGCTTAAT CTATCATATA CTCATTGCCA TTCAGCGTTT AAATGAAAAT TTTTCGTTCG 4 80 

ATATACCTAT CAATGAAATT GATAAATGGC GACATACTAA TCAGTATGCn ATTGCTTCAA 54 0 

AAATGATAGA AAACTTAGAA CGCAGTGTAA TGT 5 73 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1221 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

TTGATATTTA TAACGTTATA TTTTAATAGT TCACCTGGAT TATTAAATAA ATAGTCCGCC 60 

AAATTTTCTT TTTCTTTATC AATCTGaTkG TAATTAACaC TTTCGaCTTC TGTAGGAATT 12 0 

CTAATGTCAA CAGAAGCATT GATATAAGCT TGATGTTGCA TGCAATCACA CTCCTAATCC 180 

TTCATmTmAA ACGGAGAAGT AAACCCGTCA CTATTCAAAT TCAATCCTTT TGCCCAATCA 24 0 

ACAGGCTTAT TCATGATAGT TTCGATTTCC TTAAGTCCAT TTGAACCTCT AGGTATTTCT 3 00 

ACAATTACTT CATCATGGAC ATGGCCAACT ATTTTAAAAC CTAATGCTTC AAGCCTTGCT 3 60 

35 ATAGAAATCG CAAGTAAATC CCTTGCAGTT GCTTGAACAA TATTCTCGAC TAACTTCCCA 4 20 

CCATACGTTT TTAACTTTGA CCATTTACGG TTAAGATCTA ACCCCATAAA TTCAACAACT 4 80 

TGACTACCCC AACTATTTTC AC CAACTAAA GCTTTTGGAT AAGCTAAAGC TCTTCCACTA 54 0 

40 GGCAGTTCAA TCATTAGAAA ACCTTTTTTC ATATAAAATC TAAGTCCATG TGTATGATGC 6 00 

GTCTTTCGGG ATTTTACAGT ATTAATTGCA GCCTCTTGGC AAGCCTTCCA AAAATTAACT 660 

ATGTTAGGAT TTGCGTTACG CCAACTATCA ACTAAACCTT GTAACTCGTT TTCTTCAATG 72 0 

CCCATTTCCA ATGCACCCAT TGCTTTTAAA GCTCCAGCGC CACCTTGATA GCCTAAAGCT 7 80 

AATTCGGACA CTTTTCCTTT TTGTCTGAGA GGGTCGCCTT TAGTTATGCT TTCTACCGGT 84 0 

so ACA7TAAACA TTTGAGAAGC CGATGCTTCA TATATCTTTC CGTGTGTGTT GAATACATCT 90 0 

^^r-^ HTTCTTTTGC ATACCATGCT ATGACTCTTG CCTCTATTGC AGAAAAATCA 96 0 
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10 




AGATCTCTTG CTATTTCTAA TTCAGTATCT GAAATATAAT GCTTTGTTAA ATTCTGAAGT 114 0 

TGTACACCTC TACCTGCCCA TCTTCCAGTA CCGGCACCGT AAAATTGAAA CAGACCTCTT 1200 

ACCCGTTCAT CACTGCACAT C 1221 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1090 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

TTTTGTTTGG TATGAGGTAG CAATGACGAC GTGTCATTGG TGGAGATTGT AAAAATACAT 6 0 

AATAAAAAGA AGCGGCAATG TATACCGCTC CTTTTTTATA CTACATACCG ATTTTCAACC 120 

ATCTCTTTCT ACTTAGTAAT AAGACAATAG TATTAACTAT AAATAGAAGA ACGAAGAATG 180 

ATACTATATT TATAATTTCA GTAGGACACA TAAATGTTGA CTCGTTATTC AATATTTTTT 24 0 

CTACGGCACG ATACATCGTA TTGCTCGCCT CAAATGGAGC AACGATACCA AATATATTTT 3 00 

TATTAATGGC AACTAAGATG ACTGAACCAA TCCAATATAC AATGCTGATA CCTAAGCTGA 3 60 

TTAAAATGTT AGGTGAAACC ATACTAATCG TTCCAACAAC TAAGATATAT TGTAAGATAA 4 20 

CGAGTGAAAA TAAGATTATT AATAGTAAGT AATGTGAGAA ATCCGAATAT ATAATTGAAA 4 80 

TAATAGTGAT ACTTAGAATT ATGAACACTA AACATTCAAA AAATAACACT GCTACCTTTT 54 0 

TATAGAAGAA GGTAAAGATA TTATCGCCAA TCAATTTATA AAACAGGATA TTTTTATTCG 60 0 

AATACTCTTT ATTAATAAAA TATGCAATAA CAAATGAAAA TAGTAAGAAC CCTAATTGCG 66 0 

TTGCAACAGT ATATGAACTG AAGAAAAACT GGCTATAGCT TAAACTTTTA ACTTTGTCTA 720 

TACCTATTGG TAAAAAATAC CCAAGTAAGA AAAGGAATGT GAATAGCACA ACAAGCGTGT 780

AAATAATTTT ATTGGAAATA CTTTTTTTAA ATTCTAATTT CAAAGTGGAC ACCTCAATTA 84 0 

TAAATTAATG TAATCATTTA TGACTTCTTC TTTTGATTGG TACTCTTCTA TTTGAAGGTC 90 0 

TTTAAAAATA AAGTATTTAC CCGGCAAAGC ACTTAAATCG GATAAATTaT GTGTAATATT 960 

GATAATAGTT TTAGTTTGAT GGCTTTGAAT AAAATCATTT AAAAATTCAT AAATTTCATT 102 0 

50 AACTGTTTTC TTGTCTAAAG CGTTTGTAAC TTCATCTAAT ATGATTAAAT CATGATCTTC 10 8 0 

CAATAAGAAA 10 90 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 904 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TTAGGACTAT TTTATCATAT TCATTTAAAT TACGGCTAAA AATTTTAAAA ACGGGGATTA 60

ATATATGGAA TTAAGCTATG AAAGTTAATT GATACTTGCA TTTTACGCTG ATTTATATAA 12 0 

GAATAACTAT TGTATAGTTT TAAAAACGAA CGTACGTTTG CAGGAGGCGA AATCATTGGC 18 0 

AATGAATAAA CAAAATAATT ATTCAGATGA TTCAATACAG GTTTTAGAGG GGTTAGAAGC 24 0 

AGTTCGTAAA AGACCTGGTA TGTATATTGG ATCAACTGAT AAACGGGGAT TACATCATCT 30 0 

AGTATATGAA ATTGTCGATA ACTCCGTCGA TGAAGTATTG AATGGTTACG GTAACGAAAT 36 0 

AGATGTAACA ATTAATAAAG ATGGTAGTAT TTCTATAGAA GATAATGGAC GTGGTATGCC 4 20 

AACAGGTATA CATAAATCAG GTAAACCGAC AGTCGAAGTT ATCTTTACTG TTTTACATGC 4 80 

AGGAGGTAAA TTTGGACAAG GCGGCTATAA AACTTCAGGT GGTCTTCACG GTGTTGGTGC 54 0 

TTCAGTTGTA AATGCATTGA GTGAATGGCT TGAAGTTGAA ATCCATCGAG ATGGTAATAT 60 0 

ATATCATCAA AGTTTTAAAA ACGGTGGTTC GCCATCTTCT GGTTTAGTGA AAAAAGGTAA 66 0 

AACTAAGAAA ACAGGTACCA AAGTAACATT TAAACCTGAT GACACAATTT TTAAAG CAT C 72 0 

TACATCATTT AATTTTGATG TTTTAAGTGA ACGACTACAA GAGTCTGCGT TCTTATTGAA 780 

AAATTTAAAA ATAACGCTTA ATGATTTACG CnwGGgTAAA GAGCGTCAAG AGCATTACCA 84 0 

TTATGAAGAA GGGAt CaAAG rGTTgTTAGT atGTCCAaTG Ar GG AAAAGA AGTTTTGCCT 9 00 

GACG 904 
(2) INFORMATION FOR SEQ ID NO: 11: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11271 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GATTTCTAAA TCAAGATCTG TTTTACGATA ACCATTCAAA CCTTGACGTT CATCTTCTTC 6 0 
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TTAATAAGAC GATTCAGCAA GTTTTAAAGT ATTATTTGAC TATGTTGGAT TAGGCATCTA 300 

GTCCTATAAT ATCACTGACA TTGTCAAAAT GATGATCTTT TAAGTAACGT GCGATGCCTT 36 0 

TGTTCATTTT CTTAGTTAAA CCTGGGCCTT CAATAACAAG TGATGAATAA ATTTGAATAA 420 

GTGACGCACC GTGACGCATC ATTTTGATTG CATCTTCAGT ACTGAATACG CCGCCTGTAC 480 

CTATAATTAA AAATTCACCA TTTGTTTGCT GATAAgCATa CTTAATCAAT TTTAAATTAC 54 0 

GTTCAAATAA TGGACGACCA CTCAAACCGC CTTCTTCGAC TTTATTAGCA GAAGTTAAAC 60 0 

CATCTCGTTG TCGCGTTGTG TTTGCTAAGA TGATACCGTC AAATGTCTCA GTAATCGCTG 66 0 

15 GTAATAGTGC TTTTAAGCCA TCGAAATCCA TATCAGACGT TAGTTTTAAA TAAATTGGCA 72 0 

CTGTTACATC ATGTTGTTTT TTAAATGCTG TTAAAGCTTG GCATAACATT GAAAATTCAT 78 0 

CTTTATCATG GAAGTTTTGA AGATTTTCAG TATTTGGAGA ACTGATGTTG ACTGTGAAAA 84 0 

20 ATGAAACGTC GTGTTTAAAC GTATCAATAA CCTTTATATA ATCTTGATAA CGCGCTTCAT 90 0 

AAGGTGTCAT TTTATTCACA CCAACATTGA TACCAACAGG TACTTGATAA GCATTTTTAC 96 0 

GCAAATGACT TAGTGCTTTG TTCATACCAA TATTATTGAA GCCCATTCGA TTTATCAAGG 102 0 

CGTCATCTTC TAATAATCTA AACATGCGTG GTTGAGGGTT ACCCGGTTGA GGTTTAGGTG 108 0 

TGATACCACC TAATTCTAAA GCACCGAATC CAAGGTGTTC CAATGCTTTT GGTACTTCGC 114 0 

AAGATTTGTC GAAACCAGCT GCTAAgCCAA TTGGATTGTC GTACGTATTA CCTTGTATCG 1200

TTTGTGATAA CGTTGGATTC TTATAAGTAA ATAGTTTATC GACGACTGGG AATAAAACCG 12 6 0 

GaAACTTTTG TaACGTTTTT AATGCATCGA TAGTTAGTCC GTGTGCTTTT TCGGGTTCGA 13 2 0 

35 TTTTGAATAA GAAAGGTTTA ATTAATTTGT ACATGAGTAT GCTCCTATTT CATTATATTT 13 8 0 

GAGGCTTACT ATCCTCAACT TAATATATGT GAAATATATT CTTTTAATAG ACTAGCATTT 14 4 0 

CCATACATAA TTTCCTAGTT AAAACTAAAA AGTTTTGAAA ATTGACGCAA gTTTGAATAA 15 0 0 

40 CGTTTTTAAG ATTAAATCAT CCTAATTAGG CAATATTATA GTATAAAGTA AGTAGATTGG 15 60 

AAGGTGTTTG TATGAATGAA CAATGGTTAG AGCATTTACC TTTAAAAGAT ATTAAAGAGA 16 2 0 

TTTCACCAGT GAGTGGTGGT GATGTAAACG AAGCATATCG AGTCGAAACA GATACGGATA 16 8 0 

CATTTTTCTT ACTTGTCCAA CGTGGACGTA AAGAATCATT TTATGCTGCA GAAATTGCAG 174 0 

GTTTAAATGA ATTTGAACGT GCAGGTATCA CGGCACCTAG AGTAATTGCA AGTGGCGAGG 18 0 0 

TTAACGGTGA TGCGTATTTA GTGATGACGT ATTTAGAAGA AGGGGCTTCA GGGAGTCAAC 1860

GCCAATTAGG GCAACTCGTA GCTCAATTAC ACAGTCAGCA ACAAGAAGAA GGCAAATTTG 192 0 

GCTTCTCATT AC CTTATGAA GGTGGCGATA TTTCTTTTGA TAATCATTGG CAAGACGATT 198 0 
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GGCTATGGGA TGCCAACGAT AT CAAAGT AT ATGACAAAGT GCGACGTCAA ATTGTGGCGG 210 0 

AATTAGAAAA G CAT CAAAGT AAACCGTCTT TATTACATGG TGACCTATGG GGTGGTAATT 2160 

ATATGTTCTT ACAAGATGGT CGTCCGGCGT TATTTGATCC AGCGCCATTA TATGGTGACA 2 22 0 

GAGAATTCGA TATCGGTATT ACAACGGTAT TTGGTGGTTT TACGAGCGAA TTTTATGATG 228 0 

CGTATAATAA ACATTATCCA CTCGCAAAAG GTGCATCCTA TAGACTTGAA TTTTATCGTT 234 0 

TATATTTATT GATGGTCCAT TTATTGAAAT TTGGTGAGAT GTACCGTGAT AGTGTTGCGC 24 00 

ATTCTATGGA TAAGATTTTA CAAGATACAA CAAGTTAGTT AAGACGTTAG ATTGAGATAA 24 60 

15 ATAGATAATA TGCACAGATA TTTTTACAAT GAGAAGCGAT ACAGCTGCCT CAATAAAAAT 2 52 0 

ATTTGTGCGT TTTTATTGTT GGAAAATAAA ATTTTAATCG CTATTGTTAA TTTCTGTAAT 2580 

GTAAAACAAG GTTGAGTTAC AATAAAAGTG ATTTTATAAC TTTTTGTTCA ATAAAATTCT 264 0 

20 AGGAATGATA CAT A TTTATT GATACAATAA TTTTGAATAT AATCATAAAA CAATATTTAA 2 70 0 

GTATAATTGA ATGTTTGAAT ATCATATATT GATACAGTTT CTAATAATTT TAAAATAATT 276 0 

TAAATGGAGA GAGGTGTAAA TGATGAGTAC AGTTCAAAGT GATATTTTTA AGACCAATAG 2 82 0 

TGCATCATCA TCTATTAAAA GCGCTGTTGA AACATGTAAT AATGTGTCGA AACCGGATAA 2880

AGATGAAAGT ACAACAGTAA GTGGAAATAA TAATGCTCAT AGTGTGATAG ATGATTTGAT 294 0 

GAGTAAGAAT CAATCTGTTG CTGAAGCAAT ACGAACTGCG AGCGATAATA TACAAAAAGT 3 000 

TGGTGAGGCT TTTGACCAAA CTGACGTAAT GATTGGTAAT GAAATTGGTA AAAATTAAAA 3 06 0 

CGTGGTGAAA TGATGTCGAA TAAACTGGAT GAAATCAATA AAATAATCAC AGCGAAACAT 3120 

35 GAGCAAATGG ATGACTTATA TGATGAAAAG CGAGAGGTTA AAG CATTG AT AGATGAAAGT 3180 

GATGCGCTTA ATCATTCGAT AGATCAATTA TATCAACATT TAGGTGAGCG TTATTATAGT 3 24 0 

AGCAATATGG CTAGTCGTAT GGAACAGTTC CGCGATGAAT TTCATTTTGC GAAACGACGT 33 00 

TCAACGGAAG CGTTATACGA GCAGCAACAG CAAATTCAAC ATGGCATTCG TAAAGTGGAA 3 36 0 

GAAGAGATGA TTGACTTGGA AATGCGAAGG AATGTTGAAA TTGAGACGGT GACAAAGGAG 34 2 0 

GAAAATAAAT GGAAACAATA GG AAG CATT A TTTATTTAAA AGAAGGTTCG CAAAAGTTAA 34 8 0 

TGATTATTAA TAGAGGmCCA aTTGTAGAAA TTGAAAATCA AAAGTATATG TTTGACTATT 3 54 0 

CTGCATGTAA ATATCCGATT GGTGTTGTAG AAGATGAAAT TTATTATTTT AACGAGGAAA 3 60 0 

so ATATAGATTC AG7TATTTTT AAAGGTTATT CTGATCAAGA TGAGGTTAGA TTT CAAGAGT 3 66 0 

TGTTTGAAAA TATGAAACAA AATTTGGATA GTGAAATACA ACGTGGAGAA GTTACACAAC 3 72 0 
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w 




ATGTCATTCA TAATCATTTG AACTAAACGT AGCAGCCTTA AATTTTAAAA AAAGACACAT 3 9 00 

ACCAACTTCC GAAATGTAGA TGAATTCTCT ACAATAACGG AAGTTTTTCT TTTAATATTG 3 96 0 

AAATTTCTCA AGGATAGGTC TATACTTTAT AAATCGTAAT TATTACGATT TATAATCAAA 4 02 0 

AACAATAACT TGAAATAGAT CATTGAGGGA GTGTTAATAT GCAACATCAT AAAGTGG CT A 4 08 0 

TTATcGGTGC CGGTGCTGCA GGTATAGGTA TGGCCATTAC CTTAAAAGAT TTCGGTATAA 414 0 

CAGATGTCAT TATTTTAGAA AAAGGAACAG TAGGACATTC ATTTAAACAT TGGCCGAAAT 4 2 00 

CGACCCGTAC GATCACGCCA TCATTTACGT CTAATGGATT TGGCATGCCT GATATGAATG 4 26 0 

CAATTTCCAT GGATACTTCA CCAGCATTTA CATTTAATGA AGAACATATT TCCGGAGAAA 4 32 0 

CATATGCTGA ATATTTACAA GTGGTTGCCA ACCATTACGA GCTGAATATC TTTGAAAATA 4380 

CAGTTGTCAC AAATATATCT GTAGATGATG CATATTATAC GATTGCAACG ACAACAGAGA 4440 

TATATCACGC GGATTATATC TTTGTCGCAA CAGGTGATTA TAATTTCCCT AAAAAgCCAT 4 500 

TTAAATATGG TATTCATTAT AGTGAAATTG AAGACTTTGA TAACTTTAAT AAGGGGCaAT 4 56 0 

ATGTGGTTAT CGGAGGTAAT GAAAGTGGCT TTGATGCTGC ATATCAACTT GCAAAAAATG 4 62 0 

GCTCTGACAT CGCACTTTAT ACTAGCACAA CCGGTTTAAA TGATCCGGAT GCTGATCCTA 46 80 

GTGTTAGATT GTCACCTTAT ACACGTCAGC GACTAGGTAA TGTCATTAAG CAAGGTGCTC 474 0 

GCATCGAAAT GAATGTACAT TATACAGTTA AAGATATTGA TTTTAACAAT GGACAGTATC 4 8 00 

ATATCAGTTT TGATAGCGGA CAAAGTGTGC TTACACCTCA TGAACCAATA CTAGCAACTG 486 0 

GCTTTGATGC AACAAAAAAT CCAATCGTTC AACAATTATT TGTGACAACA AATCAAGATA 4 92 0 

35 TTAAATTAAC AACACATGAT GAATCGACAC GTTATCCGAA TATTTTTATG ATTGGTGCAA 4 98 0 

CAGTTGAAAA TGATAATGCC AAATTATGCT ATATCTATAA ATTTAGAGCG CGATTTGCAG 504 0 

TACCTGCACA TCTTTTAACA CAGCGGGAAG GcTTACCAGC TAAACAAGAT GTCATTGAAA 510 0 

ATTATCAAAA AAATCAAATG TATTTAGATG ATTATTCATG TTGTGAAGTG TCATGCACAT 5160 

GTTAGAAGTG AAATATGATA TGAGAACTGG GCATTATACG CCCATACCTA ATGAACCTCA 5220 

TTATTTGGTT ATTAGTCATG CGGATAAACT TACCGCAACA GAAAAAGCGA AATTAAGATT 52 8 0 

ATTAATCATA AAACAGAAAT TAGATATTTC ATTGGCAGAA AGTGTAGTTT CTTcGCCTAT 5 34 0 

AGCGAGTGAA CATGTGATAG AACAATTGAC ACTATTTCAA CATGAGCGAC GACATTTAAG 540 0 

ACCTAAAATA AGTGCGACAT TTTTAGCCTG GTTGTTGATA TTTTTAATGT TTGCATTGCC 54 6 0 

AATCGGTATC GCTTATCAAT TTTCAGATTG GTTTCAAAAT CAGTATGTGT CAGCATGGAT 552 0 

AGAATATTTA ACTCAAACAA CATTGCTCAA TCACGATATA TTACAGCATA TATTATTTGG 558 0 
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ATTGATTAGT TTATCAACTG CTATAATTGA TCAAACAGGA CTCAAATCAT GGATGATATG 57 0 0 

GGCAATTGAA CCGTCAATGT TATGGATAGG ATTACAAGGT AATGATATCG TGCCACTATT 57 6 0 

AGAAGGGTTT GGATGTAATG CAGCAGCTAT TTCACAAGCA GCACACCAAT GCCATACCTG 5 82 0 

CACGAAGACA CAGTGTATGA GTTTAATAAG CTTTGGTAGT TCTTGTAGTT ATCAAATAGG 5 880 

TGCGACATTA TCTATTTTTA GTGTAGCTGG AAAGTCATGG CTATTTATGC CGTACTTAAT 594 0 

ATTAGTACTT TTAGGTGGCA TCTTACATAA AGGATATGGT TGAAAAAGAA TGATCAACAA 6 00 0 

CTTAGCGTTC CGCTACCTTA TGATAGGCAA TTACATATGC CAAATATACG TCAAATGTTG 606 0 

15 CTACAAATGT GGCAAAATAT ACAAATGTTT ATCGTTCAAG CGCTACCTAT TTTTATCACA 612 0 

ATCTGTCTTA TTGTTAGTAT TTTATCACTA ACGCCAATTT TGAATGTTTT ATCACAAATA 6180 

TTTACACCTA TATTATCGTT ATT AGG CATC TCGTCAGAAT TGTCACCAGG GATTTTATTT 624 0 

20 TCAATGATTC GAAAAGACGG CATGCTCTTG TTTAATTTGC ATCAGGGCGC CTTATTACAA 63 00 

GGAATGACAG CAACACAGTT ACTACTACTT GTGTTTTTTA GTTCAACATT TACAGCGTGC 636 0 

TCGGTCACAA TGACGATGCT TTTGAAACAT TTAGGTGGTC AGTCAGCACT AAAATTAATT 64 2 0 

GGAAAGCAAA TGGTGACATC ATTGTCTTTA GTTATTGGTG TAGGCATCAT TGTTAAAATA 64 8 0 

GTAATGCTGA TTATTTAAAA AAAATGAACT ATAACTGAAT ATAGAGTCAT GTCAGTCAAT 6 54 0 

AGGAGATCTA TCTTGGAATA TGCTATTCAT ATGAAGTATA AGAGGAGAGT CGCAGATGAA 6600 

AATAG TT ATT ATAGGTGGGT TTTTAGGTGG CGGTAAAACG ACTGTCTTAA ATCATTTGCT 666 0 

CGCTGAATCA TTAAAGGAAT CGCTGAAACC AGCAGTCATC ATGAATGAAT TTGGGAAAAT 6 72 0 

35 GAGTGTTGAT GGTGCCTTAG TATCTGAAGA CATACCTTTA AGTGAACTGA CAGAGGGGTG 678 0 

TATCTGTTGT GCAATGAAAG CAGATGTATC AGAACAGTTA CATCAATTAT ATTTAAAAGA 6 84 0 

GCAACCAGAC ATTGTATTTA TTGAATGTAG TGGGATTGCA GAACCGGTCT CTGTCTTAGA 6 90 0 

40 TGCTTGTTTA ACGCCTATTT TAGCTCCGTT TACAACAATT ACACATATGA TTGGTGTAAT 696 0 

AGACGCAAGC ATGTATAAAC ACATTAAATC ATTCCCTAAA GACATCCAAG GCTTATTTTA 702 0 

TGAGCAATTA GCATATTGTT CTGTCTTATT TGTTAATAAA ATAGATTCAG CAGATGTTGA 708 0 

AACAACGAGC AAACTATTGA AAGATTTAGA AGTTATTAAC CCAGAGGCCG ATATACAAGT 714 0 

CGGTATGCAT GGCAGCGTCA CTTTGCCAAT ATCAGTTAGA CAAATGACAG CAACTTCTGA 72 0 0 

CAATAAACAT AAGTCTTTAC ATCAAATGAT TAATCATCAA TTTGTGCAAT CACCAGTCAA 7260

ATGTACTAAA GCAGAGTTTA TAAAACGTTT AGCATGCCTT CCGTCTCATA TTTATAGGTT 73 2 0 
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CGGAAAGGGT ATTTCAAAAG AAGACTATCA ATGTTTGGAA CAGTAGTGTT TTCAGTGGAA 7 500 

GAGAATGGTT AACATGCCTT CATGTATAAT AACGAGTTGA TTTGAACGTT TAAGCGTAAA 7 5 60 

TAAAAATAAG CTTGGTCAGC CATCAAATAT AATTTGAAAA CTGTCCAAGC TGTTTTATTA 762 0 

GAGAACAATC AATTAACCCC ACATATTTAA TAATACATCA GCAAAGCCTT CAGGTTTTTG 7680

AATATAACCT AAGTGACCGC CTGGAATATC TACAATAGGT ATGCCAGTTT CTTTATTTAT 774 0 

ATAAAAGTTA ACATCTTGTG GGAAGGAGCC TCTAGAATCT GTCCCATTTA GTAGGGTGAT 78 0 0 

TTTATCGCTG TATTTTGTGA AATCATCCAA AGTAATATCT GAATGCGTAT ATTGTCTAAT 786 0 

TTCAAATTCT GACCAGAACA TCGTACGTTT GTACTGTTCT ATACGTCCTT CTTCAGTATC 7920 

AGCAGGTTGA GACATCATTT TTGCATCAAT TGGTGCGATA TTTAATGTTT CGCCAAATGT 79 8 0 

TTTCATGCCT TTTTCTAAGC CTTCTGTTAA AATTTGATGC ACAATGTCAT CATTTTTATC 8 04 0 

TTTCCAATAA GTACTGTCTG GTAAAAATGT ATTAATTGGT GGTTCGTGAA ATGCAATCTT 810 0 

TTTAACGACT TCAGGGTAAT CTTTTAACAC ATGCATCGCA ACGATTGAAC CTGAACTTGA 816 0 

ACCTAATATA TAGACAGGTT CATCACTTAA TGACTTTGCA AGTTCGGCAA TGTCCTGTGC 822 0 

GTCGCGTTTG ACACGATAAT CACTGTCAGG GTTTGAAGCG GAATCAGGGA GTGGTTCAGT 82 8 0 

TAACTCGCTT TCTCCATAAT CACGACGATC AACGGCTACA ACAGTAAAAT GGTCTTTTAA 834 0 

CTGTTCTGCA AGAGGCAGAA AAATGTCTCC GGTACCGTTT GCACCAGGAA TAAAGATGAG 84 00 

CACGGGTCCT TGTCCGACTT GGTGGTATCG TAATTTAGCG CCTTGTAATT CTAAAGTTTC 84 6 0 

CATATTCAAT GACCTCCATT TGTTAATTGT TAGGTGATAA ACCTAATAAT TTAGCACCAT 8 52 0 

35 TTGTATAACT TATTTTCTCT TTTTCTTCAT CTGTTAAACC CAGTTCATCT AAAAATACAC 85 8 0 

CTAATTTTTC AGGCTCAATA TATGGATAAT CAGCAGCATA AAGAATTCTA TCAATACCTA 864 0 

CTTCTTTCTT GACTAAATCA AACTGTGGCT TCGTTAACAT GCCACTCGGT GTGATATAAA 8700 

AATTATTTTT AAAGTAATAG CTTACAGGGT GGTTCAAATG TTCAGCGAAT AAAGCTTCAT 87 60 

CCATACGTTC TAAGAAGAAT GGGATAAACT CACCCCAATG TCCAATAATC ATATTTAACT 882 0 

TTGGATAACG ATCAAAAATA CCAGATAATA CTAGATGTAT TGTATGAATG CCGACATCAA 88 8 0 

TGTGCCAACC ATAACCAAAA CAAGCAAATG TTGCCGCAGT TACTTCAGGA TAATTTCCTT 8 94 0 

TATAGTATGA TTGATAAATG TCACTGTTAA CTGGCGCGGG ATGTAGATAA ATCGGTACGT 900 0 

CTAAATTTTC AGCTGTTTTG AAAATAATGT CATATTTGTC TTGATCAAGA AAACCATCTT 906 0 

GTGCACGTCC CATAATGAGC GCACCTTTGA ATCCTAAATC ATTGATGCAA CGTTCGAATT 912 0 

CTCGCGCTGC GGCTTCAGGC TCATTGATAG GTAAAGTTGC AAAGCCTACA AAGCGATTGG 918 0 
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TCTGACCAAC CAAATTTGAA GGAGAACCAT TTCCATAAGA TAAGACTTGA ATTTGAACGT 93 0 0 

CTTGATTATT CATAAATTGG ATACGTTCAT CATGATGTGA TAATT CGTCG GCATTTGTAA 93 6 0 

AACCTGTCTT TTTTTcAAGG CCTTCTAACA TTACTTTCAT CGGTACACCT TTAGG AT CTG 94 2 0 

CTGATATCGC ATTCATCGTT TCTTTTTGAA TATCTTCAAT GACATAATGT TCTTCAAACG 94 8 0 

TAATACTTTT CATTTACTTC GCCTCCATAT TGTATTGCAT GTTTATTGCA TCTATTGCAG 954 0 

AAGCATTTTT TATATACCTC TAATTTCAAT GTTTGTAACA TAAAATTGAT CTACCAAGGC 9600 

ATCTCTCCAT CGCCATTAAT AAATGTACCT GTTGGGCCAT CTGCACCAAT CGTTGCTAAT 9660 

TGAATGATTG GCTTGATTCC TTCAGAAACG TGTTTGGAAT TATTACTAAA ATCACCAACT 9720

AAATCAGTAT TTGTAGCGCC TGGATCAGCA GCATTGATTT GCATGTTAGG TAATC CTTTA 978 0 

GCGTATTGTA GCGTTAGCAT TGTTACTGCC GATTTAGACG AACAATAAGC TAATGAATTC 9 84 0 

ACTTTAGATT CAGCTOTTTP GGGGTTTGTA ACCATTCCAA ATGAACCTAA ACCACTTGAT 990 0 

ACGTTGACGA CAACAGGTTG TTCAGATTTT TCTAAGAGAG GGACGAATGT ATTCATCATT 996 0 

CGTACGATAC CGAATACATT CGTTTGATAT ACTTCTTCAA CGTCACGAGG TGTCAATTTG 1002 0 

GAAGGTGCTG AAAATTGACC AGATATACCT GCATTGTTAA TGAGGATATC AAGACGGCCT 10080 

TCTTTTTCAG CAATCATGTT ATAAGCATTT TTGACTGAGT AGTCACTTGT AACAT CTAAT 1014 0 

TGTACATAAT GAACACCTAA TTTTTGTGAT GCTTGTTGTC CTCTTACATC ATTCCGAGAA 102 0 0 

CCTATATAAA CTTTGTAACC CAATGCTTTA AGTGCCTCTG CACTTGCATA GCCTAACCCT 1026 0 

TTATTGCCTC CTGTGATTAA CACAATTTTA GTCATTACGT CCCACCTCAT CTAAATAAAT 10320 

35 GTTTAATAAA TAATTTCTGT ACGCTTCAAT TGAAATATGG CGATGCTCTA TTTGGAAGGC 10380 

AAATACACTA GTTGATAATG ATTGCAACAG CATATCTGTT TTGAAtTCGT GTAAGTGTCG 10440 

TCATCGCTTT TAAATAAGTC ATAATAAAAA TCAAATAATT CTTGATAAAA TGCGCTTTGG 1050 0 

TAAAAACGTA ATTTATTGTT GCCTGCTTCA ATACATTGCA GTAGTGCCTT ATTATCGATT 10560 

TTAAATTGTA AAAGATAATC TAACGACACT TGCATAACCT CATAATTAGA ATGATAGTCA 10620 

TCTTTAATTT GCTTAAAATG AGTGATAAAA ATATCAAGGT CTCTTTGTAT GACGTAGTAG 10680 

CATAAATCGC TTTTATCTTT GAAATGTCGA TACAATGTCC CCATACCGAT ACCTAGTTCT 10 740 

TTAGCAATAC GATTCATACT AATGTTTTCA ACGCCTTCTT CATCAAAAAG TTTGTGCGCT 10800 

50 ATTTCTTCAA TTCGTTGCCT ATTCTCTTTT GCATCTTTTC GCATGATTAC ACCTACTTAA 10 860 

AATTCTCTAA AATTGACAAA CGGATAACTC TCCGTTTATT ATAAAACGTG TTAAGAAAGT 10 920 
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ACCTTATCGG TTCAAATGAT TGCTGAAAAA CTGAATGTCA CTACAGAAGA TGTGGAAAAA 11100 
GTATTAGCTA TGACAGCGCC ACTAGGCATT TTTAGTCATC AATTACAACG ATTTATTCAT 11160 

TTAGTATGGG ATGTCAGAGA TGTAATAAAC GACAATATTA AAGGAAATGG ACAAACACCA 112 20 
GAACCATATA CGTATTTAAA AGGTGAAAAA GAGGACTATT GGTTTTTAAG A 112 71 

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6261 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

CAACCCGTTC AGAACAAAAT AAAAACCGTA CAATTTTATC ATCTTAATGA TTATTGTACG 60

GAAAAACTTT TTTACATCAT ATCTGCATGT GCATAATCGA TATCGGTAAA TTTATTATAT 120

TGTTTCATAA AATGTAACTT AACTGTGCCT GTTGGACCGT TACGTTGCTT AGCAATGATA 180

ATTTCAATTT CACCGTTTTC ATCATTCGTT TGTGGCTCGA AACCACCATC ATCGTCATCA 240

TCTTCATCGC CGCCACGGTT ATAGTAATCA TCACGGTATA AGAATGCAAC GATATCGGCA 300

TCTTGCTCAA TCGAACCAGA TTCACGAATA TCACTCATCA TTGGACGTTT ATCTTGTCGT 360

TGTTCAACAC CACGAGATAA CTGACTTAAT GCGATAACTG GACATTTTAA TTCACGGGCT 420

AATGCTTTTA ATGTACGAGA GATTTCAGAA ACTTCCTGTT GTCTGTTATC GGACGCACGT 480

GAACCACTAC CTTGAATCAA CTGTAAGTAG TCAATCACAA TCATGTCTAA GCCATGTTCT 540

TGCTTTAATC GACGACATTT AGAACGTAAA TCATTAATTC GAATACCCGG TGTATCATCA 600

ATAAAAATCT TCGTACGTGA TAATTTACCT ACCGCTATAG TAAAACGACT CCAATCTTCC 660

TCAGTCATAG TACCCGTTCT TAAGCGGTTT GAGTCAACAT TTCCAGAACT ACAAATCATA 720

CGTGTGGCTA ACTGATCAGC ACCCATCTCT AGCGAGAAAA TACCAACTGT ATACATATCT 780

TCATGCGTTG CAACTTTTTG TGCAATATTA AGTGCGAACG CAGTCTTACC TACAGATGGA 840

CGCGCTGCAA GGATAATTAA ATCATTTCGG TTGAACCCTG CTGTCATTTG GTCTAAATCT 900

CGATATCCTG TAGGTATACC TGGTGTTTGA CCACTATTTT GATCAAGCTC TTCAGCTGTT 960

TCATACACTT GTCCTAAGAC GTCTCGAATG TCTTTAAAGC CATCGCTTTC ACGAGAAGAT 1020

GATAGCTCTA AAATTCGACG TTCTGCATCA CTTAAAATCG CATCTAGTTC AAGTTCATCA 1080

TTATATCCAT CATTGGCAAT ACTATCTGCA GTTTGAATCA ATCTACGTTT TAATGCATGC 1140

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TCTGCAAGAT ATTGCGGGCC ACCCGCTTcA TTCAACGTAC CTTCCGTCGA TAATTGATCC 1260

ATCAATGTTA CAACATCAAT TTCTTTATTA TCTTCATTTA AGTGCATCAT TGCACGGAAA 1320

ATATGTTGAT GGGCACCCCT ATAAAACGAC TCAGGAAGCA AAACTTCCTG AGTAGTATTA 1380

ATCAATTCTG GATCTATAAT AATTGAACCT AAGACAGACT GTTCAGCTTC ATTGTTATGC 1440

GGCATTTGAT TTTGCTCATA CATTCTATCC ATGAATGGTT ACACCTCTTA TTTCAATCCA 1500

ACTTTATTGT TCAACTGTGT GTACGCGAAT TGTACCTTCA ACTTCTTTAT CTAATTTAAC 1560

AGGTACATTC GTATATCCTA GGGAATGAAT TCCATTTGGT AAATCCATTT TACGTTTATC 1620

AATTTTAATA TCATGTTGTG CTTTTAGTGC TTCGGCAATT TGTTTTGTAC TTACTGACCC 1680

AAACAATTTA CCACCTTCAC CAGTTTTTGC TGaTACTTCA ACTTCAATGT TTGATAACGT 1740

TTCTTTTAAT GCTTTAgCAT CTTCAATTTC TTGTTGGCGT TCTTGTTTTG CACGTTTTTT 1800

CTGTAACTCT AATTGTTTAA GGTTACCTGG TGTTGCTTCT ACAGCATAAT TCTTTTTCAA 1860

TAAGAAGTTA TTTGCATAAC CTACTGGTAC TTCTTTAACT TCACCTTTTT TACCTTTACC 1920

TTTACCTTTA ACATCTTGTG TAAAAATTAC TTTCATGCAT CTTCACTCCT ACTTAATTGT 1980

TCTGTAATTG CTTGTTGTAA TTGTGCTATC GCCTCTTCGA CTGTCACACC TTTAAGTTGT 2040

GTTGCCGCAT TGGTTAAATG TCCACCGCCA CCAAGTGCTT CCATTGTTAA CTGGACATTT 2100

ACTGAACCGA GTGAACGCGC AGATATACCA ATCAGATTAT CTTCACGTCT CGCAACAACA 2160

TATGATGCTT CAATACCTTC TAAACTTAAC AGTTCATCTG CTGCTTGTGC AACTGTTACT 2220

GGATGATAAA TTTTATCGTC TGAACCATGC GcAATGGCTA TGCCATTATC TTCAACTTTT 2280

ACAGTTCGAA TTAATTCAGA TCGATTAATG TAAGTATCCA CATCATCTTT TAAGAAATGT 2340

TGCGTTAAAA TCGTATCTGC ACCATGTGCA CGTAAATAAC TCGCTGCATC GAATGTTCTT 2400

GATCCTGTTC GTAATGTAAA GTTTCTTGTA TCTACAATAA TACCTGCATA CATCACTGTT 2460

GATTCAAGAC GTGTTAAACG TTGTTCTGTT GGTTGATATT CCAGTAACTC TGTTACCAAT 2520

TCAGCTGTCG AACTTGCGTA TGGTTCCATA TATATCAACA ATGGATTAGA GATGAAGCTT 2580

TCACCACGTC TATGATGATC GATAACAACT TTACGGTTTG CTTTATTTAA GACATTTTCA 2640

TCTAAAACCA GTTCCGGTTT ATGCGTATCA ACAATCACTA CGGTTGTCTT AGATGTCATC 2700

ATATCCCAAG CATCATCTGA TGTAATAAAT CGCTCTCTTA ACTCTGGCTT TTTATCTATT 2760

TCGTTCATCA CGCGTCGTAA TGTTGGATCA ATGTCAGTCT CATTTAATAC GATGTATGCT 2820

TCTAAATTAT TCATCATTGC AAATCTAGAC ACACCGATTG CTGCACCAAT TGCATCTAAG 2880

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CCATAGAAAC GCACATTACC ATTAATACTT TTAATTGCAA CTTGGTCGCC ACCGCGTCCT 3060

AATGCTAAGT CTAGGCCTGA TTGTGATAAT TCACCTAAGT CGATTAAATT TTCAGTACCT 3120

TCACCAACAC CGATACTTAA TGTTAATTGG GCACGATAAC CAACACTTTT TTCACGTAAT 3180

TGACTCAAGA TATCAAATTT AGATTCTTCT AAGTCAGCTA ATATTTTTTG ATTTAAATAG 3240

GCTACGAATT GATCGGAACT GTATCTTTTG AAAAATATAT TATACTCAGT TGCCCATCGA 3300

CTAATGACAC GCGTTACCAT TGAGTTGATT TCCGAACGCT GCGTATCATT CATATTTTGC 3360

GTAATCTCAT CGTAGTTATC TAAAAATAAT GTCGCAATGA TTGGTTTAGA ATTTTCATAT 3420

AGTTCATTTG TTTGTACTTG TTCAGTTATA TCAAAGAAAT AGAGGCAGTG ATCATTCTCA 3480

GAATAACGTA CTTGGAAATG ATACTGATTA TATTCTATTT cAACGGATTT CACTCTATCT 3540

AATTGCTTTA AAATGTTTGG AAATACTTCA TTTACAGATT CAGAAATGAC ATTCGCTTCC 3600

ATATGATCTG TCATAAATTG GTTAACCCAT TCGATGTGAT CATTTTCATC TAAAACAATG 3660

ATACCAATTG GTAAATGTTT GATTGCTTTA TTATTTGTTG TTGAAATTTG AGCACTCAAA 3720

CCATCTACAT AACTATCCAT TTTCATTAAA GCTTGTCTGA ATAAAATGAT GCTAACAATA 3780

ATCATCACGA CAAGAACGAT AGATGCAATT AGTGCTATAA GACTATTAAA GATAAACCAT 3840

ACACCCATTA AAACAATTGC TGTGATGATC ATGATGACAA ATGGTATTAG TAAAGCTTTC 3900

TTAGTGGACT GCCGATTCAT TATTCCACCT CTATTCACTT TTTAGAATTA TTTTTCATGA 3960

TTCGCTTCAA ATTCAAACTT AAATCGATAA CACCAAGTAG TCCTACAATA TGTGTCGTAG 4020

GTGTCAGTAT TGTACCGATA ACCAATAGTA AAATCGTTAC TGCATTCGGC AAACCTTTCG 4080

CTTTACCAAA GAAATGAATA ACACTTAAAC CTTGAATATA CATTACTAAT GATAACACAA 4140

GTTGGAAGTT TAAAAGAATG CTCTGGAACA CACTCGGTTG ACCTGTAAAT AATAAACATA 4200

TGATAACAAT AATGTATATC CATAATAAAA TACCGCTCAT TTGCCACGCG AAAAGTGGCT 4260

TAAATACAGG TGTAGCGATT TTAAATTTTC GTAAAATCGG AAATGTAACG ATTAAGTTAA 4320

TTAAGACGAT TAAAAATGTA ATGATAATGA TGAAACCTGG TAATTGAACG GTCGCTTGTC 4380

TAAACCCTTC TTCTAATATT TGGGTCATAT TCGCATCGGC ACCGCTCATC GTAATCGCTT 4440

CATGTAATGT TTGCTTGAAA GGTTTTACTA TGCTCGCTGA TGGTGGAATC CTTCCGAATG 4500

TTTGTAGTAA CATAAAAGCG ATTAATGAAA TTnArCTCAT CGCTACTGTT GTTACGTATA 4560

ATATTCTTTC TTTAGACGTT CTTTCTTTGA GCAATTGACC AATAATTAAA CTTGCAATTA 4620

AGACTAATAT GATGGCACTT AAAACGAAAG TATTACCTAA AACAGTTGTT ATAATTACTG 4680

TAATAAGTGC ACTAATCCCG AAAGATTGTA TTGATTTATT CCATAAAACG ATACCTGGTA 4740

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CAAATACCAA CGCAATCGTT GCAATTATTG TTGCTTTAGG TTGTATTTTT GAAAACACAT 4860

AAGCCACTCC CATATTTTTA ACTATAGCTA TTATTTTAAC CTCTTTAATG AAAATTAACA 4920

ATTTATAGAT TGTATGCTTC TATTTCATTT AATTGAATAA TAACTTTCAT GTTTTATAAG 4980

TAATTAACAT ACTCATTTGA ATCGCTTTTG TGTGCTTTCA TTTTCAACAT GATTATTTAA 5040

TCCCACTACA TAGCAATCAA GCTTGATTTA GATTTACAAT ACATTTCCAC TCTCATGTAC 5100

TCTAGATGTT TTTGAATATG ATAACTGTGA TTTAGTGGCT TCATTCTTTG AAAATATATA 5160

TTATTACTTA CGCTTAAAAT GCTTTAAATT TAAGAAATGA TATAAGTTAG GTGCCCAGGT 5220

ACTAAAGTTT AGTAGGaATC CATCATGCCC AACATTATCA GGCACGAAGA AATGACGATG 5280

ATATTTAAAA CGTTCACCTA ATGCACGAAC TTGATCATCC GGATATAGCA AATCATCTAT 5340

GAACCCCATC GTTAACACTT TTGTTTCTAA ATTTTTAAAA ACATGCGTTA CGTCTGTGCG 5400

ACCTCGGTCA ATGTTGTGAC TATCCAATAC ATCTAGCAGT GTCAGATAAC AATTCAAATC 5460

AAAATGTTCT TTAAATTTAT TACCTTGATG TTGTTGGTAT GCGACTACTT CATCCGGCGT 5520

AAAACGTTCA TCATAACTTT TTGATGATCG ATATGTCAAA AAACCTAATT GGCGTGCAAT 5580

ACTTAGACCT TCCTTACCAC CAAGATGAAT GGCTTGCCTT GCAATTTCAT TGAAAGCTCT 5640

ACTATAAGAT GATGTTCGAC TTGTTGCAGC AAGGATAATG GCTTTATCTA CTTCAAACTG 5700

TTGATTGTAG AGTAGTTCCA TTGCTTGCAT ACCTCCAAGA CTTCCCCCTA TTAAAATATT 5760

AATCTTATCA TAACCAAGGG CTTGTATACC TCGTTCATTC GCTCTGACTA TATCTCTTAA 5820

TGTTAATTTT TTAGGAAAAT GAGGGTCGTT TAAAGGTGAA CTTGAACCGA AAGGACTACC 5880

AATAACATCA AATGTTAAAA ATTGATAATC GTGAATGGGT ATATATCCCC CATCAATAAT 5940

TTCTCGCCAC CAACCCGGAT AATCATCTGT TCCATATGTT AAATGATTGC CAGTTAATGC 6000

ATGACAAACT ACAACTAATG GTTGTCCATG ATAACCGACA TGCTCATATC TCAAACGCAA 6060

GTnATCTATG ACTTCCCCAG ATTCTGTAAT AAATTCCCCT AAATTTAAAG TATCTACTGT 6120

GTAATTTGTC ATTGTTCTTT CCTCCTTAAA CAAAAAAACT TCTCACCCTA TTGAAAAGTA 6180

AGAAGTCTTT ATACTTATCA TTCGAGTAAC TCGTTGGTTT TAGCACCGTG CTATAAAGTC 6240

GGTTGCTGAA GTATCACAGG G 6261

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1222 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATGCGATTAA CTCTGGAAAT ATCTTTTCCA TATTTACGTn TTAAATTATT CAGCAAATTC 60

ATACGAGaTT CATACTCGTT yAACACTTGT TCGTCGAATT CTGTATTAGC CATTTCATCA 120

TATAACTCAT GTTTTGCATC TTCTAAAATG TAGTAAAATT GATCAATATC TTCTTTTAAT 180

TTGTCATATT TGTTTGGAAC TATATCGTTT ATTGTTAACA AATGGTTGCT TAGTTCATAT 240

AAACGATCAG TGATAGCATT TTCATCCGTT AATGTCATAT ATGCGTTATT AAGCGCTAAG 300

CTTAATTTTT CAGAGTTTTG AATGCGTTTA ATATCTATTT CAAGTTGCTC TATTTCGCCT 360

TCTTTTAGAT GTGCTTCAGA CAATTCTTCT AATTGGAATT TCATTAAATC TAAACGCTGT 420

AGCAATGCTT GGTCTGCTGA TTCTAAATCT TCTAACTCTT GCTTTTTGGC TTTATAATTT 480

TGAAAAGTTT GGTGATATTT ATCCAACAAA TCTTGATAAC GTGATTCTGC GTAATTATCC 540

AATAATGTTA AATGGTATTT TTGTTTCAAC AAAGACTGCG TTTCATGTTG GCCATGAATA 600

TCTAATAATT CTTGCATAAC TTTTCGTAAA TCTTGTAAAG TAACTGTTTG ATTATTAATT 660

TTACAAAGAC TTTTACCAGA GCTGAAAATT TCCCGTTTAA CTAATAAAAA ATCTTCATCT 720

ACATCAATAT CCATATTTTT CAATATATGT ATAGCATCTT TACTCTCGTC AATATCAAAT 780

ATACCTTCGA TGACAGCCTT TTTTTCACCA TGTCTTACAA AATCAGATGA AGCTCTCATT 840

CCAATTAATT GTCCAATTGC ATCTATAATA ATTGACTTAC CTGAACCCGT TTCACCACTT 900

AAAACAGTTA AACCATCAGA AAATTGAATT TCTAATTCTT CAATAATAGC AAATTGCTTG 960

ATTGATAAGG TTTGTAACAT AAACTCATCG CATCCTTATA ACAAATTGAA AATTCTTGAC 1020

TTGATTTCAT CACTTGCCTC TTTGCTTCGA CAAATAATTA AACAAGTATC ATCACCACAA 1080

ATTGTGCCTA GTACTTCTTC CCAATTGATT TGGTCTAATA TAGCTCCAAT AGATTGTGCA 1140

TTACeAGGTA TGTTTTTAGA ACAAGTAAAT TATCAGTACC ATCTATATTA ACAAAGGAAT 1200

CCATTAAATA ACGTCCCAAT TT 1222
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TTTGTTATTA TTACnTnAAA TAATTGCATT ACTTTTTACT GATGGTACAA CTTTCCATCC 60

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TTCTTTTGGC ACGACATAAT TGTCTTTATC TTGAACTAAA TATCCGCCAG ATACTGAAAC 180

AAACTCTTCT TCGTTACTGT CTATAGTCAT ATCAATTTCT AATAATCTTA CATTCTTCTT 240

TTGTTTTAAA ATATCTAATG CTTCATCTGT AAATTTTGGT GCAATAATGA CTTCCAAAAA 300

GATACTATGC AATTGCTCTG CTAACTCAGG TGTTACAGCT CGGTTTAATG CAACAATTCC 360

ACCAAATATT GATTGACTAT CCGCTTCATA CGCATGTTGA AATGCTTGTT CTATCGTGTC 420

ACCGATACCA ACACCACATG GATTCATGTG TTTAACCGCA ACTGTAGCAG GTGTATCAAA 480

CTTTTTAACT AAAGCTAGTG TAGCATCTGC ATCTTTAATA TTGTTATAGC TTAATTGTTT 540

CCCATGTAAT TGTTTAGCGC CTGCAATCGT GTGCTTAGCA TTCGAAGTTC TCACAAAATA 600

CGCTGATTGT TGTGGATTTT CTCCATATCT TAAAGTTTCT TTATCCCCTT TAAAGAAACG 660

TACAATCGCT TCATCATATT CTGCAGTATG CTCAAAAACT TTAATCATTA ATGATTGTCT 720

ATATGACTCA TCTAACGAAT CGTTTCTTAA TCGCGTCAAT ACTTCTTGAT AATCTGCCGG 780

ATGTACAATT GTTGTTACAT GTTTATAGTT TTTAGCTGCA GCACGTAACA TTGTTGGACC 840

ACCAATATCA ATATTTTCAA TTGCTTCGTC CATCGTCACA TCAGGGTTTG CAACAGTTTG 900

TTGGAATGGA TATAAATTAA CTACTACCAT ATCAATTAAA TCTATATGTT GTTCTGATAA 960

TTCATTTAAA TGCTGCGGTT TATTTCGATC AGCTAAAATG CCACCATGAA CAGCCGGATG 1020

T 1021

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3759 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TCATTCACTC CTAAA7TGTT ATTACACTAT TACACaTAGC TAATCATCAA TGTGAAATCA 60

CCTTCAAAGA CACTATCCAA ATCTTCAGAA GTCAAAATAA AGTTTGTACC AGTAGTCAGT 120

TTGAAAATTT CACCATCGAC AATCATTTGC CCTTCGCCTT CCAACACTGT AACTAAACAG 180

AACTCTCTAG GCTTCATATA ATTTAACGTG CCAGAAATTT CCCATTTAAC CAATGTAAAG 240

AAATCATTCG ATACAATGTG TGTACACTTA TGGTTTTCAA TAATTTCGCT TTCAGGCAAA 300

ATATTAGGTA ATGGTGCATT GTACTGAATA ACGTCTAAAG CTTTTTCAAT ATTTAACGGT 360

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ATAAAAtAGa ATTCyCCAGG kTTTACtTTA AtatATCyAA gTAtCGaCtC tATCGTTCCG 540

TGTTGAACAT GATTCGCAAC TTCTTCTCTA GACTCTGCTA ATGTCCCtAT AACTATTTCT 600

GCATCTTCTT CTGCATCTAT AATATACCAA CATTCAGATT TGCCATATTG CCCgTTTTCA 660

TGCTCATAAG CATAAGAATT ATCAGGGTGC ACATGAATAG AAAGTGATTC TCTTGCATCC 720

ACTATTTTAG TTAGAAGCGG AAAATCTTTG CTTGGGAAAT CACCAAACAA TTCACGATGT 780

TCTGACCAAA TACGGTCTAA TGTTTGACCT TGATATGGTC CATTAATAAT CTCGCTCGTA 840

CCATTTGGAT GTGCTGACAC ACACCAACAT TCCCCCAGTT GTATCATTGT CTAATTGATA 900

TCCAAACTCA CTTAGACGTT GACCGCCCCA TAATTTTGTT TTTAAAATTG GTTGTAAAAA 960

TAATGGCATT GTTGCACCTC CATTGTGATT AAGTAAGCAA TAGAACTCTG ATGTTGTTGT 1020

TCCATTATAT TTTGATTTTG TTCTCATTTA CATCGTATTA TTAACTTCCA CATTTCAAAT 1080

TAACTATTAG TGATTGTACC ATATTTACTA ACATTGCAGT ACTGCCAATT AAAAGnGCTT 1140

CACTTAAATT TACAGTACTT TAACATTTTC AAAAATTTAT AGCATAGAGA TTATATCTCT 1200

CTTACATTTG TACATATTTC CCTTTAAATT TACTCGCCCA TTATACCAAT TAATAaACAA 1260

CTTTAATAGT TGTGCCATAC ATTGTTCAAA TTCTTTGTAA AACGCATAGA CAATACGTAC 1320

TTATTCATAC TTATAATTCA TCATTTTCAA AAAATAACGA GTTACGAAAA AGTAACCCGC 1380

TTCAAATCAT ATTTACTATC CTTATTAATC CGTTTCATTT TCAAATTGAG TTAAAGCATC 1440

TTTAATGTCC TGATCACCAC TAATAATTTG AAACTCTTGG TGATTAAAAT GATTGGATGT 1500

GACAATTTCT TTTAATACTG TCGCAACATC TTCTCTAGGA ATTTCACCTT TACCATCAAA 1560

ATATTGTGCA GCTTCTATCT TTCCAGATCC TGCTGCATTT GTAAGTGCCC CTGGATGTAA 1620

AATTGTATAA TTCAAACCTG nAACGTCTTA AATAGTCATC AGCGTAATGT TTAGCTATTG 1680

TATATGGCTT TAAATCACCG CTATCATCAA AAGCCTGACG TCTCGAATCA TATGTTGAAA 1740

CCATGACATA GTGTTTAATA TTGGCCTCTT TACTCGCAAT CATTGATTTA ACAGCACCAT 1800

CTAAATCGAC AATAATTGTT TTATCTGCAC CCGTGTTCCC TCCAGAACCT ACTGAAAAGA 1860

TAACTTTATC GAATGGTTTA AACGTCTCAG TTAAAGTCTC TATTGAATCA TTTTCAACAT 1920

CAACAAGAAT TGCTTTCATA CCTTGTGATT TTAACGCATT AAGTTGATCT GATTGCCTAA 1980

CACCAGCAGT AAATGGTACA TTTTCTTTTG CTAATTGTTG CACTAGTAAC GAACCTACAC 2040

CGCCATTAGC ACCTATAACC AAAATATTCA TTTACAACAC TCTCCTATkT ATTATTCTCT 2100

ATGCCATACC ACTTTATGAG ATATGTAAAA CTTGTTACAA CTATAAAAAT CAATTGACAT 2160

ACTACTGGGA ACGTATTAAA TTAATATATG AACAAATATT CATATGAAAG GATTGTCATA 2220

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tCaAGGCATT AGcGATTACA ATCGAATACG TATCaTGGAA TTGTTATCaG TCAGCGAAgC 2340

AAGTGTTGGT CACATTtCAC ATCAATTGAA TTTATCTCAA TCAAATGTCT CGCACCAATT 2400

AAAATTACTT AAAAGTGTGC ATCTTGTGAA AGCAAAACGA CAAGGCCAAT CAATGATTTA 2460

TTCATTAGAT GACATCCACG TAGCAACTAT GTTAAAGCAA GCCATACATC ACGCGAATCA 2520

TCCTAAAGAA AGTGGGTTAT AATATGTCTC ATTCACATCA TCATCATGAC CATATGCATA 2580

GTCATGTAAC TACAAATAAT AAGAAAGTAT TGTTTATATC GTTTTTAATA ATCGGTCTAT 2640

ATATGTTTAT CGAAATCATC GGCGGTCTCC TTGCTAACAG CTTGGCATTA CTATCTGACG 2700

GTATCCATAT GTTTAGCGAC ACATTCTCAT TAGGTGTTGC ACTTGTCGCA TTTATTTATG 2760

CTGAAAAGAA TGCCACAACT ACAAAAACAT TTGGTTATAA ACGTTTCGAA GTACTCGCAG 2820

CGTTATTTAA CGGTGTAACG CTTTTTGTAA TAAGTATTTT GATTGTTTTT GAAGCGATTA 2880

AACGTTTCTT TGTTCCTTCT GAAGTTCAAT CAAAAGAAAT GTTAATCATT AGTATTATCG 2940

GTTTAATTGT CAATATCGTT GTTGCATTCT TTATGTTTAA AGGCGGCGAC ACTTCACACA 3000

ATTTAAATAT GCGTGGTGCT TTTCTACATG TTATCGGAGA CTTATTAGGT TCAGTTGGCG 3060

CCATTACTGC AGCTAkTTTA ATTTGGGCAT TTGGATGGAC AATCGCCGAT CCTATCGCAA 3120

GTATTTTAGT TTCCGTTATT ATTTTAAAAA GTGCTTGGGG TATCACAAAA TCTTCAATTA 3180

ACATTTTAAT GGslAGGCACA CCAAGTGATG TTGATATAGA TGAAGTTATA ACTACTATTA 3240

AAAAGGATTC ACGAATACAA AGTGTGCATG ATTGCCATGT TTGGACAATT TCAAATGATA 3300

TGAATGCATT GAGTTGTCAT GTTGTTGTAG ACCATACATT GACAATGAAA GAATGTGAAT 3360

TATTATTAGA AAaCATTGAG CATGATTTAT TACATTTAAA TATTCACCAT ATGACTATTC 3420

AATTAGAAAC GCCTAATCAC AAACATGATG AATCGATTAT ATGTTCAGGA ACACATAGTC 3480

ATTCACATAA CCATCATGCT CATCATCACG CGCATGTACA TTAATAATTT TAACCTACTG 3540

CCATTGCATC GATTAAACTT TTCAATGGCA GTAGGTTTTT TATGTCTTTA TGGCGACTTG 3600

TTTGGTCTTT GATGATGCAA TGTTTATTAA CAAATTTTCA ACTATTATTT CTTACATTAG 3660

TCATATTTTT GACAATTTAC TATTATAATT CTCTAACTTT AGTCACTTTA ATTAATTTTT 3720

ATTAGATATT AATATGAAAA TAACGTGTTT TTTGTTATT 3759
(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13086 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

TAATTATCGC GCATAACAAA ACATTAGCAG GACAATTATA TAGTGAGTTT AAAGAATTTT 60

TTCCTGAAAA CAGGGTGGAA TACTTTGTAA GTtACTATGA TTATTATCAn CCAGAGGCAT 120

ACGTACCGTC TACTGACACT TTTATTGAAA nAGATGCCTC AATCAnTGAT GAAATTGATC 180

AACTACGACA TTCTGCTACA AGTGCATTAT TTGAACGCGA TGATGTAATT ATTATTGCTA 240

GTGTAAGTTG TATATATGGT TTAGGTAATC CTGAAGAATA TAAAGATTTA GTAGTAAGTG 300

TTCGAGTTGG TATGGAAATG GATAGAAGTG AATTACTTAG AAAACTTGTc AGATGTGCAA 360

TATACACGAA ATGACATCgA TTTcCAACGA GGAACGTTTC GAGTGCGTGG TGATGTAGTG 420

GAAATATTCC CAGCCTCTAA AGAAGAACTT TGTATAAGGG TTGAGITl-lT CGGCGATGAG 480

ATTGACCGTA TCCGAGAAGT TAACTACCTA ACAGGTGAAG TGTTGAAAGA AAGAGAACAT 540

TTTGCGATAT TCCCAGCTTC TCACTTCGTA ACACGTGAAG AAAAGTTGAA AGTTGCGATT 600

GAACGTATTG AAAAAGAATT GGAAGAACGA TTGAAAGAAT TACGAGATGA GAATAAATTA 660

CTAGAAGCGC AAAGGTTAGA ACAGCGTACC AACTATGATT TAGAAATGAT GCGAGAGATG 720

GGATTCTGTT CAGGAATTGA AAACTATTCC GTACATTTAA CTTTGCGACC ACTGGGTTCG 780

ACACCATATA CTTTATTGGA TTACTTTGGC GATGATTGGT TAGTAATGAT TGATGAATCA 840

CATGTGACAT TACCGCAAGT TCGAGGCATG TATAACGGAG ACAGAGCGCG TAAACAAGTT 900

TTGGTGGATC ATGGGTTTAG ATTACCGAGT GCATTAGATA ACCGTCCACT TAAATTTGAA 960

GAATTTGAAG mAAAGACAAA ACAACTTGTG TATGTATCTG CAACGCCTGG ACCATACGAA 1020

ATTGAACATA CGGATAAGAT GGTTGAACAA ATTATTCGTC CTACTGGTTT ACTGGATCCT 1080

AAGATTGAGG TTAGACCTAC TGAAAATCAA ATTGACGATT TATTAAGTGA AATTCAAACA 1140

AGAGTgAGCG TAATGAACGC GTACTTGTTA CAACGCTCAC TAAAAAGATG AGTGAAGATT 1200

aACCACATAC ATGAAAGAaG CGGGTATTAA aGTtAATTAT CTGCATTCAG AAATCAAGAC 1260

ATTAGAACGA ATTGAAATAA TTAGAGACTT ACGAATGGGT ACATATGATG TTATCGTAGG 1320

TATTAATTTA TTAAGAGAGG GTATTGATAT ACCAGAAGTT TCTCTAGTTG TCATATTAGA 1380

TGCAGATAAA GAAGGGTTTT TACGTTCTAA CCGCTCATTA ATTCAAaCAA TAGGTAGAgC 1440

TGCGCGTAAC GATAAaGGTG AAGTCATTAT GTATGCCGAT AAAATGACTG ATTCGATGAA 1500

GTATGCAATT GATGAGACAC AACGTCGTCG AGAAATACAG ATGAAACATA ATGAAAAACA 1560

TGGTATTACA CCTAAAACAA TTAATAAAAA AATACATGAT TTAATTAGTG CTACTGTTGA 1620

AAATGACGAA AATAATGACA AAGCACAAAC TGTGATACCT AAGAAGATGA CGAAAAAAGA 1680

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TTTCGAGAAA GCTACAGAAT TAAGAGATAT GTTATTTGAA TTAAAAGCAG AAGGGTGACA 1800

AGTAAATGAA AGAACCATCC ATAGTAGTAA AAGGTGCTCG TGCGCATAAC TTGAAAGATA 1860

TTGATATCGA ACTACCTAAA AaTAAATTAA TTGTTATGAC AGGTTTATCT GGGTCAGGTA 1920

AATCGTCATT AGCATTCGAT ACTATATATG CTGAAGGACA ACGACGTTAT GTTGAATCAT 1980

TAAGTGCCTA TGCGCGTCAA TTTTTAGGCC AAATGGACAA ACCAGATGTT GATACAATTG 2040

AAGGATTATC GCCAGCAATT TCAATAGATC AAAAAACAAC AAGTAAAAAT CCAAGATCAA 2100

CTGTAGCAAC AGTAACAGAA ATATATGATT ATATACGTTT GTTATATGCA CGTGTTGGTA 2160

AACCTTACTG TCCAAATCAC AATATAGAAA TTGAATCGCA AACAGTACAA CAAATGGTTG 2220

ACCGCATTAT GGAATTAGAG GCACGTACAA AGATTCAATT ATTAGCACCT GTCATCGCTC 2280

ATCGTAAAGG TAGTCATGAA AAGCTAATCG AAGATATTGG TAAAAAAGGT TATGTACGTT 2340

TAAGAATCGA TGGCGAAATT GTTGATGTAA ATGATGTACC TACTTTAGAT AAGAACAAGA 2400

ATCATACAAT AGAAGTTGTT GTAGACCGAT TAGTTGTTAA AGATGGAATT GAAACACGAC 2460

TAGCTGACTC TATAGAAACT GCCTTAGAGC TTTCAGAAGG ACAATTAACA GTCGATGTCA 2520

TTGACGGGGA AGACCTTAAG TTTTCAGAAA GCCATGCTTG TCCTATATGT GGATTTTCAA 2580

TCGGAGAGTT AGAACCAAGA ATGTTTAGCT TTAACAGTCC TTTTGGTGCT TGTCCGACAT 2640

GTGATGGCTT AGGCCAAAAG TTAACAGTCG ATGTAGACTT GGTTGTTCCC GACAAAGATA 2700

AGACGCTAAA CGAAGGTGCA ATAGAACCTT GGATACCGAC GAGTTCTGAT TTTTATCCAA 2760

CATTGTTAAA ACGTGTTTGT GAAGTTTATA AAATCAATAT GGATAAACCT TTTAAAAAGT 2820

TAACAGAACG TCAACGTGAT ATTTTATTGT ATGGTTCTGG TGACAAAGAA ATTGAATTTA 2880

CATTTACACA ACGTCAAGGT GGTACTAGAA AACGAACAAT GGTTTTCGAG GGTGTAGTTC 2940

CTAATATAAG TAGACGATTC CATGAATCTC CTTCAGAATA TACACGTGAA ATGATGAGTA 3000

AATATATGAC TGAACTACCT TGCGAAACTT GTCATGGAAA GCGATTGAGT CGTGAAGCkT 3060

TATCTGTTTA TGTAGGTGGT TTAAATATTG GTGAAGTAGT CGAATATTCA ATCAGTCAAG 3120

CGCTGAACTA TTATAAAAAC ATTGATTTGT CAGAACAAGA TCAAGCGATT GCAAATCAAA 3180

TATTGAAAGA AATTATTTCC CGACTCACTT TTTTAAATAA TGTGGGACTT GAATATTTAA 3240

CGTTAAACAG AGCTTCAGGT ACACTTTCAG GTGGTGAAGC ACAACGTATT CGATTAGCAA 3300

CGCAAATTGG GTCGCGTTTG ACTGGTGTCT TATATGTATT AGATGAGCCA TCAATTGGAC 3360

TGCATCAAAG AGATAATGAT CGATTAATTA ATACACTTAA AGAAATGAGA GATTTAGGAA 3420

n TA~TTTAAT TGTAGTTGAA CACGATGATG ATACAATGCG TGCGGCTGAT TACTTAGTGG 3480

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AGGTAATGAA AGATAAAAAA TCATTAACAG GACAATACTT GAGTGGTAAG AAACGTATTG 3600

AAGTACCTGA ATATCGCAGA CCGGCTTCAG ATCGTAAAAT TTCTATACGT GGAGCTAGAA 3660

GCAACAATCT TAAAGGGGTT GATGTGGACA TACCACTATC AATCATGACG GTTGTTACAG 3720

GTGTATCAGG TTCTGGTAAA AGCTCATTAG TAAATGAAGT ATTATACAAA TCATTAGCTC 3780

AAAAAATTAA TAAATCTAAA GTAAAGCCAG GATTGTACGA TAAGATTGAA GGTATTGATC 3840

AACTTGATAA AATTATTGAT ATTGATCAAT CACCAATAGG TAGAACGCCA CGCTCTAATC 3900

CAGCAACATA TACTGGTGTG TTTGATGATA TACGTGATGT GTTTGCGCAA ACAAATGAAG 3960

CTAAAATTCG AGGATATCAA AAAGGGCGTT TTAGTTTTAA TGTAAAAGGT GGACGCTGTG 4020

AAgcTTGTAA AGGTGACGGT ATTATTAAAA TTGAAATGCA TTTTTTACCT GATGTTTATG 4080

TTCCTTGTGA AGTGTGTGAT GGTAAACGAT ATAATCGTGA GACACTAGAG GTTACTTACA 4140

AAGGTAAAAA TATTGCTGAC ATTTTAGAAA TGACTGTTGA AGAAGCAACA CAATTTTTTG 4200

AAAATATTCC TAAGATTAAG CGCAAGTTAC AAACACTAGT TGATGTTGGT CTTGGATACG 4260

TCACATTAGG TCAACAAGCT ACAACGTTAT CAGGTGGTGA GGCTCAACGT GTGAaACTTG 4320

CATCTGAACT TCATAAACGT TCAACTGGTA AATCTATTTA TATCCTAGAT GAACCGACAA 4380

CAGGGTTACA TGTTGACGAT ATTAGTAGAT TATTAAAAGT ATTAAACCGA TTAGTTGAAA 4440

ATGGTGATAC TGTTGTAATT ATTGAACATA ACCTAGATGT TATCAAAACA GCAGACTATA 4500

TTATAGACTT AGGTCCTGAA GGTGGTAGTG GCGGTGGTAC TATTGTTGCG ACTGGCACAC 4560

CCGAAGATAT TGCTCAGACA AAGTCATCAT ATACAGGAAA GTATTTAAAA GAAGTACTTG 4620

AACGAGATAA ACAAAATACT GAAGATAAAT AAGATTAAAA GAAGTGAAGG ATGTTATAAA 4680

TTTATCCTTC GCTTCTTTTT ATTAATTTAG TAATGAATAG TAGAAAGAAA AGATGCGTAA 4740

AAAGAATTAT GTTAAGATAG GGTCAATCTA GAGTAGTTAA ACATAAATCG AACTGGGAGT 4800

GGGACAGAAA TGATAAAGAA TCACTAATGA TTTATTATGT AGTGGTTCTT TGTCATTAGC 4860

CACAGCTATT GTGTACTTAA AAATAGGaat GCaTgAGTGC AACTCATGCA TAAGaAATAC 4920

TAATTTCTAA AGAAAAAGTA TTTCTTTATG TTGGGGCCCC GCCAACTTGC ATTGTTTGTA 4980

GAATTTCTTT TCGAAATTCT TTATGTTGGG GCCCCGCCAA CTTGCATTGT TTGTAGAATT 5040

TCTTTTCGAA ATTCTTTATG TTGGGGCCCC GCCAACTAAT TCCAATATAT CATTGTAGAG 5100

CTTAGGTCAT TGATTTTTGG CTCGGACTTT TATGGCGATA TGAACCATGT AAATTAAGCA 5160

AGCAATAAAT TAATGATTGA TATTGACTTG TAAAATAATA ACAATAATGA ACAATTAATA 5220

TTTATTTTAG CTTTTCAATG TAGATTGGTG TTATATTTTT GATATGATAA GAAGAGATGT 5280

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ACATTAAAGT TAGATTTAAT CGCTGGTGAA GAAGGACTAT CGAAGCCAAT TAAAAATGCT 5400

GATATATCAA GACCGGGCTT AGAGATGGCA GGTTATTTTT CACATTATGC GTCAGATAGA 5460

ATACAACTAT TAGGAACAAC GGAACTATCG TTTTACAATT TATTACCAGA TAAGGATCGC 5520

GCAGGTCGTA TGCGTAAACT ATGCAGACCA GAAACGCCTG CAATTATTGT GACACGTGGA 5580

TTGCAGCCAC CAGAAGAATT AGTTGAAGCT GCAAAAGAAT TAAATACCCC ACTTATAGTT 5640

GCTAAAGATG CGACTACAAG TTTAATGAGT CGCTTAACAA CGTTTTTAGA GCATGCACTT 5700

GCAAAGACGA CATCTTTACA TGGTGTTTTA GTAGATGTTT ACGGTGTTGG TGTACTAATT 5760

ACCGGTGATT CAGGAATAGG TAAAAGTGAG ACTGCGTTGG AATTAGTTAA ACGTGGGCAT 5820

AGATTAGTAG CAGATGATAA TGTAGAAATA CGTCAAATTA ATAAAGATGA ACTAATAGGG 5880

AAACCACCAA AGTTAATAGA ACATCTATTA GAAATACGTG GACTAGGTAT TATCAATGTT 5940

ATGACTTTAT TTGGCGCGGG TTCAATATTA ACTGAAAAAC GAATTAGATT AAATATTAAT 6000

TTGGAAAACT GGAACAAGCA AAAGTTATAT GACCGCGTAG GTCTTAATGA AGAGACGCTA 6060

AGTATTTTAG ATACTGAAAT CACTAAAAAA ACAATACCTG TAAGACCTGG TAGAAATGTT 6120

GCGGTAATTA TTGAGGTCGC TGCAATGAAC TATCGATTAA ATATCATGGG CATTAACACG 6180

GCCGAAGAAT TTAGTGAAAG ATTAAATGAA GAAATTATCA AGAACAGTCA TAAGAGTGAG 6240

GAGTAGGTTG AATGGGTATT GTATTTAACT ATATAGATCC TGTGGCATTT AACTTAGGAC 6300

CACTGAGTGT ACGATGGTAT GGAATTATCA TTGCTGTCGG AATATTACTT GGTTACTTTG 6360

TTgCACAACG TGCACTAGTT AAAGCAGGAT TACATAAAGA TACTTTAGTA GATATTATTT 6420

TTTATAGTGC ACTATTTGGA TTTATCGCGG CACGAATCTA TTTTGTGATT TTCCAATGGC 6480

CATATTACGC GGAAAATCCA AGTGAAATTA TTAAAATATG GCATGGTGGA ATAGCAATAC 6540

ATGGTGGTTT AATAGGTGGC TTTATTGCTG GTGTTATTGT ATGTAAAGTG AAAAATTTAA 6600

ACCCATTTCA AATTGGTGAT ATCGTTGCGC CAAGTATAAT TTTAGCGCAA GGAATTGGAC 6660

GCTGGGGTAA CTTTATGAAT CACGAGGCAC ATGGTGGATC GGTGTCACGC GCTTTTTTAG 6720

AACAATTACA TTTGCCTAAT TTTATAATAG AAAATATGTA TATTAACGGC CAATATTATC 6780

ATCCAACATT CTTATATGAA TCCATTTGGG ATGTCGCTGG ATTTATTATC TTAGTTAATA 6840

TTCGTAAACA TTTAAAATTA GGAGAAACAT TCTTTTTATA TTTAACTTGG TATTCAATTG 6900

GTCGATTCTT TATAGAAGGA TTACGTACAG ATAGCTTAAT GCTCACAAGT AATATTAGAG 6960

TTGCACAATT AGTATCAATT CTTTTAATTT TAATAAGTAT AAGTTTAATT GTATATAGAA 7020

-rATAAOTA TAATCCACCG TTGTATAGCA AAGTTGGGGC GCTTCCATGG CCAACAAAAA 7080

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TTATGGCGTG TATACCGTCT TGTTAAATTT TCGAAAGTTT TTAAGAATGT AATTATCATT 7200

GAATTTTCGA AATTTATTCC AAGTATGGTA CTGAAAAGAC ATATATATAA ACAACTTTTA 7260

AATATTAATA TCGGTAATCA ATCGTCGATA GCTTATAAAG TAATGTTAGA TATTTTTTAC 7320

CCAGAACTGA TTACGATTGG TAGTAACAGT GTTATTGGTT ACAATGTAAC AATTTTGACG 7380

CATGAAGCAT TAGTTGATGA ATTTCGTTAT GGACCAGTGA CGATAGGATC TAACACTTTG 7440

ATTGGTGCAA ATGCTACCAT TTTACCCGGT ATAACGATTG GTGACAATGT AAAAGTTGCA 7500

GCTGGTACGG TTGTTTCAAA AGATATACCG GATAATGGAT TTGCATATGG CAACCCTATG 7560

TATATAAAAA TGATTAGGAG GTGACAATTT TATGGCGCAA AAGAATAATA ATGTAATTCC 7620

AATGACTTTT GATGATGCAT TTTATCGTAA AATGGCTAAA CAGAAGTTTA AACAAAGAGA 7680

ATATAAACGA GCTGCTGAAT ACTTTGAAAA AGTGTTAGAA TTGTCACCTG ATGATCTGGA 7740

AATTCAAATT GATTATGCAC AATGTCTAGT GCAACTTGGT ATTGCTAAAA AAGCAGAACA 7800

TTTATTTTAT GACAATATTA TTTATAATAG GCATCTAGAA GATAGCTTTT ATGAATTGAG 7860

TCAGCTCAAC ATTGAAGTTA ACGAACCAAA CAAGGCATTC TTGTTTGGTA TTAATTATGT 7920

TATTGTTAGC GACGACCAAG ATTATAGAGA TGAATTAGAT CAAATGTTTG ATGTGAAATA 7980

TCAAAGTGAA GAACAAATTG AACTTGAAGC TCAATTGTTT GTAGTTCAAA TACTATTCCA 8040

ATATCTTTTT TCTCAAGGTC GATTAAAAGA TGCAAAGAAT TATGTCTTAC ATCAACCACA 8100

AGAAGTTCAA GATCATCGTG TAGTACGTAA TTTATTGGCA ATGTGTTATT TATATCTCGG 8160

TGAATATGAT ACgGCTAAAG CATTGTACGA aGCACtATTA CAAGAGGATA GTACaGATAT 8220

ATATGCATTA TGCCATTATA CTTTGCTACT TTATAACACT AAGGAAAATG AACAATATCA 8280

AAAATATTTA AAAATATTAA ACAAAGTTGT ACCTATGAAT GACGATGAAA GTTTTAAATT 8340

AGGTATTGTA TTAAGTTATT TAAAGCAGTA TCGTGCATCA CAACAATTGT TGTACCCTTT 8400

ATATAAAAAA GGGAAATTTT TATCAATTCA AATGTACAAT GCTTTAGCAT ATAATTATTA 8460

TTATTTAGGT GAAGAAGACG AAAGTCATTA CTACTGGGAT AAATTGAAGC AAATTTCTAA 8520

AGTGGAAATT GGACATGCGC CTTGGGTAAT TGAAAATAGC AAAGAAGTTT TTGACCAACA 8580

TATTTTGCCA TTACTTCAAA GTGATGACAG TCATTATCGT TTATATGGTA TTTTTTTATT 8640

GGATCAATTA AATGGTAAAG AAATTGTGAT GACGGAAAGT ATTTGGCAGG TTTTGGAAAA 8700

TCTAAATAAT TATGAGAAAT TGTATTTAAC GTATTTAGTT CAAGGTTTAA CGCTCAATAA 8760

ATTAGACTTC ATTCATCGCG GCTTATTAAC GCTTTACCAT AATGAATTAT TTGTAAGTGA 8820

AAATGATGTA ATGGTTGCAT GGATTAATCA AGGTGAACTC ATAATTGCTG AAAAAGTAGA 8880

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TCGAAACGTT ACAAAGAAGC AAATTACAAC ATGGTTAGGC ATAACACAAT ATAAACTGAA 9000

CAAAATGATT GAATTTCTCT TGAGCATATA GATTTATGAA AAGTTAGATT TATTATATAA 9060

TGCGCATAAT GATTAATAAT GAGGAGGCGT TAATAAAATG ACTGAAATAG ATTTTGATAT 9120

AGCAATTATC GGTGCAGGTC CAGCTGGTAT GACTGCTGCA GTATACGCAT CACGTGCTAA 9180

TTTAAAAACA GTTATGATTG AAAGAGGTAT TCCAGGCGGT CAAATGGCTA ATACAGAAGA 9240

AGTAGAGAAC TTCCCTGGTT TCGAAATGAT TACAGGTCCA GATTTATCTA CAAAAATGTT 9300

TGAACACGCT AAAAAGTTTG GTGCAGTTTA TCAATATGGA GATATTAAAT CTGTAGAAGA 9360

TAAAGGCGAA TATAAAGTGA TTAACTTTGG TAATAAAGAA TTAACAGCGA AAGCGGTTAT 9420

TATTGCTACA GGTGCAGAAT ACAAGAAAAT TGGTGTTCCG GGTGAACAAG AACTTGGTGG 9480

ACGCGGTGTA AGTTATTGTG CAGTATGTGA TGGTGCATTC TTTAAAAATA AACGCCTATT 9540

CGTTATCGGT GGTGGTGATT CAGCAGTAGA AGAGGGAACA TTCTTAACTA AATTTGCTGA 9600

CAAAGTAACA ATCGTTCACC GTCGTGATGA GTTACGTGCA CAG CGT ATT1' TACAAGATAG 9660

AGCATTCAAA AATGATAAAA TCGACTTTAT TTGGAGTCAT ACTTTGAAAT CAATTAATGA 9720

AAAAGACGGC AAAGTGGGTT CTGTGACATT AACGTCTACA AAAGATGGTT CAGAAGAAAC 9780

ACACGAGGCT GATGGTGTAT TCATCTATAT TGGTATGAAA CCATTAACAG CGCCATTTAA 9840

AGACTTAGGT ATTACAAATG ATGTTGGTTA TATTGTAACA AAAGATGATA TGACAACATC 9900

AGTACCAGGT ATTTTTGCAG CAGGAGATGT TCGCGACAAA GGTTTACGCC AAATTGTCAC 9960

TGCTACTGGC GATGGTAGTA TTGCAGCGCA AAGTGCAGCG GAATATATTG AACATTTAAA 10020

CGATCAAGCT TAATTCGAAG TCGAATTAAG ATGTTGAGCT GTAAATTATT TGGATATTTA 10080

TTTTAATAGT GTCATCACAG CGTTAAAATA ATGTCTTACT TTTAAATTAA AGCAAATTAT 10140

ATAGSAAACT AGAACTTAGT ACGTATCATT TGTGCGTTTC AATGAGTTCT AGTTTTTTTA 10200

TATGTTATAT TAAACTTATA ACTTTATGGG AGTGGGACAG AAATGATAAA GAGCCACTAA 10260

TGATTTATTA TGTAGTGGTT CTTAAACATT AGCCACAGCT AATGTGTACT TAAAAATAGG 10320

AATACATGAG TAAAACTCAT GCATAAGAAA TACTAATTTC TATAGAAAAA GTATTACTTT 10380

ATCGTTGTCC CACCCCAACT TGCACATTAT TGTAAGCTGA CTTTCCGCCA GCTTCTGTGT 10440

TGGGGCCCCG CCAACTTGCA CATTATTGTA AGCTGACTTT TCGTCAgCTT CTGTGTTGGG 10500

GCCCCGCCAA CTTGCACATT ATTGTAAGCT GACTTTTCGT CAGCTTCTGT GTTGGGGCCC 10560

CGCCAACTTG CATTGTCTGT AGAAATTGGG AATCCAATTT CTCTATGTTG GGGCCCACAC 10620

CCCAACTCGC ATTGCCTGTA GAATTTCTTT TCGAAATTCT CTGTGTTGGG GCCCACACCC 106 8 0 
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ACTCGCATTG 


CCTGTAGAAT 


TTCTTTTCGA 


AATTCTCTGT 


GTTGGGGCCC 


CTGACTAGAG 


10800 




TTGAAAAAAG 


CTTGTTGCAA 


GCGCATTTTC 


ATTCAGTCAA 


CTACTAGCAA 


TATAATATTA 


10860 


5 


TAGACCCTAG 


GACATTGATT 


TATGTCCCAA 


GCTCCTTTTA 


AATGATGTAT 


ATTTTTAGAA 


10920 




ATTTAATCTA 


GACATAGTTG 


GAAATAAATA 


TAAAACATCG 


TTGCTTAATT 


TTGTCATAGA 


10930 


1 u 


ACATTTAAAT 


TAACATCATG 


AAATTCGTTT 


TGGCGGTGAA 


AAAATAATGG 


ATAATAATGA 


11040 


AAAAGAAAAA 


AGTAAAAGTG 


AACTATTAGT 


TGTAACAGGT 


TTATCTGGCG 


CAGGTAAATC 


11100 




TTTGGTTATT 


CAATGTTTAG 


AAGACATGGG 


ATATTTTTGT 


GTAGATAATC 


TACCACCAGT 


11160 


15 


GTTATTGCCT 


AAATTTGTAG 


AGTTGATGGA 


ACAAGGAAAT 


CCATCCTTAA 


GAAAAGTGGC 


11220 




AATTGCAATT 


GATTTAAGAG 


GTAAGGAACT 


ATTTAATTCA 


TTAGTTGCAG 


TAGTGGATAA 


11280 




AGTCAAAAGT 


GAAAGTGACG 


TCATCATTGA 


TGTTATGTTT 


TTAGAAGCAA 


GTACTGAAAA 


11340 


20 


ATTAATTTCA 


AGATATAAGG 


AAACGCGTCG 


TGCACATCCT 


TTGATGGAAC 


AAGGTAAAAG 


11400 




ATCGTTAATC 


AATGCAATTA 


ATGATGAGCG 


AGAGCATTTG 


TCTCAAATTA 


GAAGTATAGC 


11460 




TAATTTTGTT 


ATAGATACTA 


CAAAGTTATC 


ACCTAAAGAA 


TTAAAAGAAC 


GCATTCGTCG 


11520 


25 


ATACTATGAA 


GATGAAGAGT 


TTGAAACTTT 


TACAATTAAT 


GTCACAAGTT 


TCGGTTTTAA 


11580 




ACATGGGATT 


CAGATGGATG 


CAGATTTAGT 


ATTTGATGTA 


CGATTTTTAC 


CAAATCCATA 


11640 




TTATGTAGTA 


GATTTAAGAC 


CTTTAACAGG 


ATTAGATAAA 


GACGTTTATA 


ATTATGTTAT 


11700 


30 


GAAATGGAAA 


GAGACGGAGA 


TTTTCTTTGA 


AAAATTAACT 


GATTTGTTAG 


ATTTTATGAT 


11760 




ACCCGGGTAT 


AAAAAAGAAG 


GGAAATCTCA 


ATTAGTAATT 


GCCATCGGTT 


GTACGGGTGG 


11820 


35 


ACAACATCGA 


TCTGTAGCAT 


TAGCAGAACG 


ACTAGGTAAT 


TATCTAAATG 


AAGTATTTGA 


11880 


ATATAATGTT 


TATGTGCATC 


ATAGGGACGC 


ACATATTGAA 


AGTGGCGAGA 


AAAAATGAGA 


11940 




CAAATAAAAG 


TTGTACTTAT 


CGGTGGTGGC 


ACTGGCTTAT 


CAGTTATGGC 


TAGGGGATTA 


12000 


40 


AGAGAATTCC 


CAATTGATAT 


TACGGCGATT 


GTAACAGTTG 


CTGATAATGG 


TGGGAGTACA 


12060 




GGGAAAATCa 


GAGATGAAAT 


GGATATACCA 


GCACCAGGAG 


ACATCAGAAA 


TGTGATTGCA 


12120 




GCTTTAAGTG 


ATTCTGAGTC 


AGTTTTAAGC 


CAACTTTTTC 


AGTATCGCTT 


TGAAGAAAAT 


12180 


45 


CAAATTAGCG 


GTCACTCATT 


AGGTAATTTA 


TTAATCGCAG 


GTATGACTAA 


TATTACGAAT 


12240 




GATTTCGGAC 


ATGCCATTAA 


AGCATTAAGT 


AAAATTTTAA 


ATATTAAAGG 


TAGAGTCATT 


12300 




CCATCTACAA 


ATACAAGTGT 


GCAATTAAAT 


GCTGTTATGG 


AAGATGGAGA 


AATTGTTTTT 


12360 


GGAGAAACAA ATATTCCTAA AAAACATAAA AAAATTGATC GTGTGTTTTT AGAACCTAAC 12420

GATGTGCAAC CAATGGAAGA AGCAATCGAT GCTTTAAGGG AAGCAGATTT AATCGTTCTT 12480

GCGTTAATTC ATTCTGATGC GCCTAAGCTA TATGTTTCTA ATGTGATGAC GCAACCTGGG 12600

GAAACAGATG GTTATAGCGT GAAAGATyAT ATCGATGCGA TTCATAGACA AGCTGGACAA 12660

CCGTTTATTG ATTATGTCAT TTGTAGTACA CAAACTTTCA ATGCTCAAGT TTTGAAAAAA 12720

TATGAAGAAA AACATTCTAA ACCAGTTGAA GTTAATAAGG CTGAACTTGA AAAAGAAAGC 12780

ATAAATGTAA AAACATCTTC AAATTTAGTT GAAATTTCTG AAAATCATTT AGTAAGACAT 12840

AATACTAAAG TGTTATCGAC AATGATTTAT GACATAGCTT TAGAATTAAT TAGTACTATT 12900

CCTTTCGTAC CAAGTGATAA ACGTnAATAA TATAGAACGT AATCATATTA TGATATGATA 12960

ATAGAGCTGT GAAAAAAATG AAnATAGACA GTGGTTCTAA GGTGAATCAT GTTTTAAATA 13020

AGAAAGGAAT GACTGTACGA TGAGCTTTGC ATCAGAAATG AAAAATGAAT TAACTAGAAT 13080

AGACGT 13086
(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1350 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

CATTAGTCAT GAAAATAGCC GACAACTTCA TCTGTGAAAT CACCGGCCTT TTATTTTAGC 60

TAACTTTATT TCTGATTTTA CGATTTTAAT TGATCATACA GAGAAAGTGA TCTTTTTACA 120

ATTTCTAAAA ACTCATGATC TATATTGGAC ATTTGATGAA AATAAGACAA AATGTTTTCT 180

GTTAGCTTCT CTTGTTTTGG GAATGAATCA TCTTCTTTAA TCCAAATCGC TAATTCGCCT 240

AATGGTGTTT TATCATCTTT AAATGTTTGT ATATATTCGT AAAAGCTCAT AGTATTCCTT 300

CTCTCAATTT ACTTATATAA ATCCTACCAC GAAAGCTTTC AAGAAAACAC AATTAAATGT 360

CTATTTAGTG AACTTTTTAA GGTTGTGCAC TCTTTTAATG TCTGCCAATT AGGTCAATTA 420

ATCATCACAA TGTACAATTA ACTCTATTTT CAGTTCATAT ACTCACACAC CGTTTTTGAA 480

CAACACATTA ACTTCTCATT TAGATAAAAC GCAAAAAAGC CTGGCACCAA TACAATAGAT 540

GCCAGACTAA GAGTCTACTA TATAAATTTA TTTAGCGTAT GGTTTTACTT CGATTGCACC 600

TTCATTTTCA TCATGAACAC CATGCTTATA ATAATCAATA TATTGTGGCT CTAAAGGCTT 660

TCTGCCACGT ATAATGTCTG CTGCTTTTTC AGCTAACATT AAAACAGGTG CGTGTATATT 720

.^ CT . ^TATOTGGCA TAGCTGATGC ATCAACTACA CGTAAATTTT CCATACCGTG 780

ACTACAAGAT GGGTGTAATG CTGTTTCACC ATCTCTACGA ACCCAATCAA GAATTTCTTC 900

GTCTGTTTGC ACTTCTGGTC CTGGTGAAAT TTCTCCACCA TTGAATGGAT CCATTGCTTT 960

TTGAGATAAG ATATTTCTTG CTACACGAAT TGCTTCTACC CATTCTTTTT TATCTTCTTC 1020

TGTTGATAAA TAATTAAAGC GGATACTTGG TTTTTCGAAT GGATCTTTAG ATTTGATTTT 1080

CAAGCTACCA CGAGAGTTTG AATACATTGG TCCTACGTGA ACTTGATAAC CATGTGCGAC 1140

CGCTGCCTTT TGACCATCAT ATCTTACAGC TATTGGTAAG AAATGGAACA TTAAGTTAGG 1200

ATAAtCAACT TCGTTATTTG AACGTACAAA TCCGCCACCT TCAAAATGGT TAGATGCTGC 1260

TGCACCTGTA CGTGTGAAAA TCCATTGTAA ACCAATAAAT GGcATGCGCT TGAtATCTAA 1320

GCTTGGCtGt AATGATACAG GTTCCTTACA 1350
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1376 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

TAATGCTATT GGCAACACCA TATATGAAAn CTCCAAACGA TCCTAAACCG ACTATAGATT 60

CACCAAATTT nACAATCCAT GAATAAAGTA GTGGCCATAA GAATAACAAT ATGACAACTA 120

AAAATGTACA GTAAAATGCA GTCATAATTG GAACTAGACG TTTACCACTA AAAAATGATA 180

ATGCTAATGG TAATTCTGTT TCACTAAACT TATTGTATGC ATAAGCTGCT ATTAAACCTA 240

TTACAATACC AACAAAGACA TTGCCATTAT TCATCTTTTC AAAAGCTGAA TTTATTTCCG 300

ArGCTTTCAT TCCTAATAAA GGCGCTAATT TCATTGGTGA TAATACAACT GTAACTAAAA 360

AATATCCTAA CGTrGCTGCA rGCGsGACTG CACCATCATT TTTCTTTGCC ATTCCTATAG 420

CTACACCAAT TGCAAATAAA ATACCTAATT GCTCTAAAAT CGTAGTACCT ACCGTAGTAA 480

AGAACATTGC GATTTTCGGC GTCGCATGAA GTGCATTTAA CGTATTACCA ATTCCGGCAA 540

TAATTGCTGC AGCCGGTAAA ATGGCAACTG GTAACATTAA CGAACGCCCT AAATTTTGGA 600

AAAATTTATA CATTGAATGT CATCCTTCTT AAAATAATGT AGAAATATAA AGATTACTAA 660

TGTAACTAGA ATAACTACTT CGATACTCCG TTATAGTCAC CTAGGCTTAC TAACCAGCTA 720

TATTTCTACC TCAAGTTATT TTATAAACTT TTTACAATTT CATGCAATTC TTGTTGTAAC 780

TTTGCTGTTC GTGTTTCAAT CTCTTTTGTA ATATAATCGA TACGCTCGTT TCGTTTTAAA 840

AAAGACCGTG AATCTTAGTA GGACCAACAT AAGCAACAGG TAATATTGGT GACTTACTTA 960

ACATTGCAAT TGTTGAAGCA CCaCGTTTCA AAGGTGCACC TTCTTGCGAT GTGCGAGAAC 1020

CTGTTGGGAA GATACCAACT GTCTTATTAT CTTTCAACAA ATTGATTGGG CGTTTTAAAG 1080

TACTAGGTCC TGGATTTTCA CGATCTACAG GAAATGCATT TAAAGACGTT AAAAATTTAC 1140

CAATCCATTT ATTTTTGAAT AATTCTTTTT TAGCCATATA ATGAATTTGA TTAGGATATA 1200

ATGCCATACC TAGCATAATG ACTTCGTTAT AACTTTCATG CGTACAAGTT ACGACATATT 1260

TACTATCCTT AGGAATATTA TCTTTACCGA TTACGTATAA TGATTTTGAC ATTTTAACTA 1320

AAATGAAATT CAAAATCTTA CTAATCACTG AATACATTGT GCCACCTACT TAACTT 1376
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7363 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

TTGTCATACC AATATTTTGT AAAATATGGA ACACAAGTAA AGTGACGAAA CCAACGATAA 60

AGATTTTGTT AAATTGATCT TCAATTTTCG CAGCTAATCT TATTAGATGG AAGATTAAAA 120

ATAAAAATAT TAAGATCAAT ATGACAGAAC CGATAAAGCC AAGTTCCTCT CCAATCACTG 180

AAAAGATAAA GTCAGTATGA TTTTCAGGTA TATAAACTTC ACCGTGATTG TATCCTTTAC 240

CTAGTAACTG TCCAGAACCG ATAGCTTTAA GTGATTCAGT TAAATGaTAG CCATCACCAC 300

TACTATATGT ATAGGGGTCA AGCCATGAAT TGATTCGTCC CATTTGATAC AGTTGGaCAC 360

CTAAJAAATT TTCAATTAAT GCGGGTGCAT ATAGaATACC TAAAATGACT GTCATTGCAC 420

CAACaATACC TGTAATAAAG ATAGGTGCTA AGATACGCCA TGTTATACCA CTTACTAACA 480

TCACACCTGC AATAATAGCA GCTAATACTA ATGTAGTTCC TAGGTCATTT TGCAGTAATA 540

TTAAAATACT TGGTACTAAC GAGACACCAA TAATTTTGAA AAATAATAAC AAATCACTTT 600

GGAATGATTT ATTGAATGTG AATTGATTAT GTCTAGAAAC GACACGCGCT AATGCTAAAA 660

TTAAAATAAT TTTCATGAAT TCAGATGGCT GAATACTGAT AGGGCCAAAC GTGTACCAAC 720

TTTTGGCACC ATTGATAATA GGTGTAATAG GTGACTCAGG AATAACGAGC AAGCCTATTA 780

ATAATAGACA GATTAAGAAA TACAATAAAT ATGTATAATG TTTAATCTTT TTAGGTGAAA 840

TAAACATGAT GATACCTGCA AAAATTGCAC CTAAAATGTA ATAAAAAATT TGTCTGATAC 900

TTGCTAAAAC AGCTATAGTG GCTACTAATA CCCAGTCTAC TTTGCGAAnC aATGCTTATC 1020

CGGCTGTTGA CGAGATGAAT AATTCATTGC AAACTCCTTT TATACTCACT AATGTTTATA 1080

TCAATTTTAC ATGACTTTTT AAAAATTAGC TAGAATATCA CAGTGATATC AGCTATAGAT 1140

TTCAATTTGA ATTAGGAATA AAATAGAAGG GAATATTGTT CTGATTATAA ATGAATCAAC 1200

ATAGATACAG ACACATAAGT CCTCGTTTTT AAAATGCAAA ATAGCATTAA AATGTGATAC 1260

TATTAAGATT CAAAGATGCG AATAAATCAA TTAACAATAG GACyAAATCA ATATTAATTT 1320

ATATTAAGGT AGCAAACCCT GATATATCAT TGGAGGAAAA CGAAATGACA AAAGAAAATA 1380

TTTGTATCGT TTTTGGAGGG AAAAGTGCAG AACACGAAGT ATCGATTCTG ACAGCACAAA 1440

ATGTATTAAA TGCAATAGAT AAAGACAAAT ATCATGTTGA TATCATTTAT ATTACCAATG 1500

ATGGTGATTG GAGAAAGCAA AATAATATTA CAGCTGAAAT TAAATCTACT GATGAGCTTC 1560

ATTTAGAAAA TGGAGAGGCG CTTGAGATTT CACAGCTATT GAAAGAAAGT AGTTCAGGAC 1620

AACCATACGA TGCAGTATTC CCATTATTAC ATGGTCCTAA TGGTGAAGAT GGCACGATTC 1680

AAGGGCTTTT TGAAGTTTTG GATGTACCAT ATGTAGGAAA TGGTGTATTG TCAGCTGCAA 1740

GTTCTATGGA CAAACTTGTA ATGAAACAAT TATTTGAACA TCGAGGGTTA CCACAGTTAC 1800

CTTATATTAG TTTCTTACGT TCTGAATATG AAAAATATGA ACATAACATT TTAAAATTAG 1860

TAAATGATAA ATTAAATTAC CCAGTCTTTG TTAAACCTGC TAACTTAGGG TCAAGTGTAG 1920

GTATCAGTAA ATGTAATAAT GAAGCGGAAC TTAAAGAAGG TATTAAAGAA GCATTCCAAT 1980

TTGACCGTAA GCTTGTTATA GAACAAGGCG TTAACGCACG TGAAATTGAA GTAGCAGTTT 2040

TAGGAAATGA CTATCCTGAA GCGACATGGC CAGGTGAAGT CGTAAAAGAT GTCGCGTTTT 2100

ACGATTACAA ATCAAAATAT AAAGATGGTA AGGTTCAATT ACAAATTCCA GCTGACTTAG 2160

ACGAAGATGT TCAATTAACG CTTAGAAATA TGGCATTAGA GGCATTCAAA GCGACAGATT 2220

GTTCTGGTTT AGTCCGTGCT GATTTCTTTG TAACAGAAGA CAACCAAATA TATATTAATG 2280

AAACAAATGC AATGCCTGGA TTTACGGCTT TCAGTATGTA TCCAAAGTTA TGGGAAAATA 2340

TGGGCTTATC TTATCCAGAA TTGATTACAA AACTTATCGA GCTTGCTAAA GAACGTCACC 2400

AGGATAAACA GAAAAATAAA TACAAAATTG ACTAACTGAG GTTGTTATTA TGATTAATGT 2460

TACATTAAAG CAAATTCAAT CATGGATTCC TTGTGAAATT GAAGATCAAT TTTTAAATCA 2520

AGAGATAAAT GGAGTCACAA TTGATTCACG AGCAATTTCT AAAAATATGT TATTTATACC 2580

ATTTAAAGGT GAAAATGTTG ACGGTCATCG CTTTGTCTCT AAAGCATTAC AAGATGGTGC 2640

TGGGGCTGCT TTTTATCAAA GAGGGACACC TATAGATGAA AATGTAAGCG GGCCTATTAT 2700

AAACCCTAAA GTAATTGCCG TCACAGGGTC TAATGGTAAA ACAACGACTA AAGATATGAT 2820

TGAAAGTGTA TTGCATACCG AATTTAAAGT TAAGAAAACG CAAGGTAATT ACAATAATGA 2880

AATTGGTTTA CCTTTAACTA TTTTGGAATT AGATAATGAT ACTGAAATAT CAATATTGGA 2940

GATGGGGATG TCAGGTTTCC ATGAAATTGA ATTTCTGTCA AACCTCGCTC AACCAGATAT 3000

TGCAGTTATA ACTAATATTG GTGAGTCACA TATGCAAGAT TTAGGTTCGC GCGAGGGGAT 3060

TGCTAAAGCT AAATCTGAAA TTACAATAGG TCTAAAAGAT AATGGTACGT TTATATATGA 3120

TGGCGATGAA CCATTATTGA AACCACATGT TAAAGAAGTT GAAAATGCAA AATGTATTAG 3180

TATTGGTGTT GCTACTGATA ATGCATTAGT TTGTTCTGTT GATGATAGAG ATACTACAGG 3240

TATTTCATTT ACGATTAATA ATAAAGAACA TTACGATCTG CCAATATTAG GAAAGCATAA 3300

TATGAAAAAT GCGACGATTG CCATTGCGGT TGGTCATGAA TTAGGTTTGA CATATAACAC 3360

AATCTATCAA AATTTAAAAA ATGTCAGCTT AACTGGTATG CGTATGGAAC AACATACATT 3420

AGAAAATGAT ATTACTGTGA TAAATGATGC CTATAATGCA AGTCCTACAA GTATGAGAGC 3480

AGCTATTGAT ACACTGAGTA CTTTGACAGG GCGTCGCATT CTAATTTTAG GAGATGTTTT 3540

AGAATTAGGT GAAAATAGCA AAGAAATGCA TATCGGTGTA GGTAATTATT TAGAAGAAAA 3600

GCATATAGAT GTGTTGTATA CGTTTGGTAA TGAAGCGAAG TATATTTATG ATTCGGGCCA 3660

GCAACATGTC GAAAAAGCAC AACACTTCAA TTCTAAAGAC GATATGATAG AAGTTTTAAT 3720

AAACGATTTA AAAGCGCATG ACCGTGTATT AGTTAAAGGA TCACGTGGTA TGAAATTAGA 3780

AGAAGTGGTA AATGCTTTAA TTTCATAGAG ATTAGTCGAG GGACCTTTTA CTTATAAAAA 3840

TGATTTGAAT TAATACTAAA AGATTACAAA GAAGAGGTGG TTTTGTGTGT AAATACAAAA 3900

TTGCCTTTTT CTTTTTATGT TAAATCTATA AATTTGAAAC TAAATCAAGG TTAATTCTAT 3960

GTACACACTT TATATAGGAA GTAGTTTGAA TGTTTATATA ATGTTTTACA AAAAGATGTA 4020

GTATTATAAT GTCTAATTTC ACATGTGTTT CAGTAAAATT TGTTGTGGAA TGTTAACGAT 4080

ATACGTATTT TATAAAAaAT TTTTTATAAT GATTATTCGA ATGATGCGTA ACGCTTACAT 4140

CTTATCTAAT GCTAGCTTTT TGACAAAAAT ATGACAATCA ATTAATGTGA TTCTAATAAA 4200

TATTCGCAAA TTGCTTTATT GCGATTAAAT TTTTTTGGTG GTACTATATA GAAGTTGATG 4260

AAATATTAAT GAACTTATAT GCAAAAGTAT ATTGAGAAAT AAACAGGTAA AAAGGAGAAT 4320

TATTTTGCAA AATTTTAAAG AACTAGGGAT TTCGGATAAT ACGGTTCAGT CACTTGAATC 4380

AATGGGATTT AAAGAGCCGA CACCTATCCA AAAAGACAGT ATCCCTTATG CGTTACAAGG 4440

AAmATATC CTTGGGCAAG CTCAAACCGG TACAGGTAAA ACAGGAGCAT TCGGTATTCC 4500

AGAATTGGCA ATGCAGGTAG CTGAACAATT AAGAGAATTT AGCCGTGGAC AAGGTGTCCA 4620

AGTTGTTACT GTATTCGGTG GTATGCCTAT CGAACGCCAA ATTAAAGCCT TGAAAAAAGG 4680

CCCACAAATC GTAGTCGGAA CACCTGGGCG TGTTATCGAC CATTTAAATC GTCGCACATT 4740

AAAAACGGAC GGAATTCATA CTTTGATTTT AGATGAAGCT GATGAAATGA TGAATATGGG 4800

ATTCATCGAT GATATGAGAT TTATTATGGA TAAAATTCCA GCAGTACAAC GTCAAACAAT 4860

GTTGTTCTCA GCTACAATGC CTAAAGCAAT CCAAGCTTTA GTACAACAAT TTATGAAATC 4920

ACCAAAAATC ATTAAGACAA TGAATAATGA AATGTCTGAT CCACAAATCG AAGAATTCTA 4980

TACAATTGTT AAAGAATTAG AGAAATTTGA TACATTTACA AATTTCCTAG ATGTTCATCA 5040

ACCTGAATTA GCAATCGTAT TCGGACGTAC AAAACGTCGT GTTGATGAAT TAACAAGTGC 5100

TTTGATTTCT AAAGGATATA AAGCTGAAGG TTTACATGGT GATATTACAC AAGCGAAACg 5160

TTtAGAAGTA TTanAGAAAT TTAAAAATGA CCAAATTAAT ATTTTAGTCG CTACTGATGT 5220

AGCAGCaAGA GGACTAGATA TTTCTGGTGT GAGTCATGTT TATAACTTTG ATATACCTCA 5280

AGATACTGAA AGCTATACAC ACCGTATTGG TCGTACGGGT CGTGCTGGTA AAGAAGGTAT 5340

CGCTGTAACG TTTGTTAATC CAATCGAAAT GGATTATATC AGACAAATTG AAGATGCAAA 5400

CGGTAGAAAA ATGAGTGCAy TcGTCCACCA CATCGTAAAG AAGTACTTCA AGCACGTGAA 5460

GATGACATCA AAGAAAAAGT TGAAAACTGG ATGTCTAAAG AGTCAGAATC ACGCTTGAAA 5520

CGCATTTCTA CAGAGTTGTT AAATGAATAT AACGATGTTG ATTTAGTTGC TGCACTTTTA 5580

CAAGAGTTAG TAGAAGCAAA CGATGAAGTT GAAGTTCAAT TAACTTTTGA AAAACCATTA 5640

TCTCGCAAAG GCCGTAACGG TAAACCAAGT GGTTCTCGTA ACAGAAATAG TAAGCGTGGT 5700

AATCCTAAAT TTGACAGTAA GAGTAAACGT TCAAAAGGAT ACTCAAGTAA GAAGAAAAGT 5760

ACAAAAAAAT TCGACCGTAA AGAGAAGAGC AGCGGTGGAA GCAGACCTAT GAAAGGTCGC 5820

ACATTTGCTG ACCATCAAAA ATAATTTATA GATTAAGAGC TTAAAGATGT AATGTCTTGA 5880

GCTCTTTTTT GTTTTCAATA ATTGATTCTC TGTAGATATC aAAGTaCTAA CGTTTTAAAG 5940

GTTAAATATT TAATTGGATT GAGATCTGTA TGCGGTTATA TCaTTCTGTG TAAATATGGT 6000

TCTCCACCAA ATGTGGTGAG TATATAATTT AAAGAACTAT TTTTAAATTA AGAATAATCG 6060

AACATAAATA AACTTTATGA AATTTCAGTA TCATGTTCTT ATAAAAAACA ATAGGGCTTT 6120

TTGctGACGC TAGTGCGCGA TAAATAATAA GTTGAATATA AAAAAGATCA CTGCCAATCA 6180

TTCGTTTAAT GGCAGCGATC TTTTTTATTT AATTATTTCT CTTTCCACTG CAACATTTGA 6240

TAACCAATGC GTGGATGTGT TTTAATAATA TCTTTTGCGT CCTCATGACA TTGTGAAAGT 6300

CCATATATTC GTTTTAATAT CATCTCATAA GTGAGTACTT TTCCTTTATG ATTTGACAAT 6420

AGTTCTAACA AGCTAAATTC ATTTGGCGTC AAATGTACCT CCTGATTATT AATAACAACA 6480

GATTTGGAGC CAAAGTCGAT GCTTAGCAAA CCGTTAGTAA ATACAATGTT AGTTTCTTGA 6540

TGTGACTTAG CGATTCTCTC GATGACTCGT ATTCGTGCCC GAAGCTCATC AACATTAAAA 6600

GGTTTAGTCA TATAGTCATT CGCACCGTTA TCTAAAGCTT GAATAATTGT TTGTTCTTCT 6660

TGTCTTGCAC TTATTACAAT GATAGGAATG TCAGTATGTT GCCTGATTTC TGAAATCAAA 6720

CATAATCCAT CTTTATCTGG TAAACCTAAA TCTAATAAAA TGACATCTGG TTTATCAATT 6780

TGAATTTTAA AGTGTGCTTG TGTGGCATTG TCGGCTGTAG TTACATTGTA ATAATCTAAA 6840

GTTAATGCAA CATCAAGTAA ATGTGTGATT GCGTGATCAT CTTCAATTAT CAATATTTTA 6900

GATTGCATTA TACGTCTCCT TCGTTAAAGT CTGTATATAT ATTGAAATAG AATATACTGC 6960

CGTGTGGTTG GTTCGGTTTA TATTGTAAGT TTGATTGATG TTTGTGTAGG ATAGTCTGTA 7020

CTAAATATAA GCCTAGTCCC ATGCTTTCTT TTTGGTTATC TTTAAAATAT TTATTTGATC 7080

CTGTGTAAAA AGGCTCGAAT ATCTTTTGTt GTTCTTCTAA ACTAATTCCA GGTCCTTCGT 7140

CTATAACGGC AAATTCGATT TGTTCATAGC TAGCATAACG AATAGATAAA TTGATTTTGG 7200

TGTCAGTAGA AGTGTGTTTA ACTGCATTTT CAATCAAATT GAAtAAAgCT TGTAAAATCA 7260

ACTTACTGTC AATGTGTATA AACtGTAAAT TTACTGAGGA TGATACAGTT ATACGCTTTT 7320

TTAAATGGCG ACGTTCTAAA ATACATATCG ATTTCTTATA CTA 7363

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10470 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

TTAACAATCG ATAACCACAA TACTTCTATT GTAATTGTTT AACGATTTCn CGATTAAAAT 60

CATCTAAATC GTCTGGTACT CGACTTGTTA CAATATTGTT GTCTACAcTa CTGACTCATC 120

AACTACATGT GCGCCTGCAT TTGATAAATC TTTGCGTACA TTTAATACTG CTGTTAACGT 180

ACGACCTTTT AAATCGTCTG TATCTATTAG TATTTGTGGC CCATGACAAA TGGCAAATGT 240

TGGTACATCA TTTTTAGTAA AGTATTTAGC AAATGTGCCA TATCGACCTT CTGTATCTCC 300

f-^TAAAT^A T^TGGTGAAA ATCCTCCAGG AATTAATAAT GCATCATAAT CTTCTGGTTT 360

ATTTGCAGTA TCTCCAATCA CTACAGTATT AAAGCCTGCA TTCTCTAATG CCTCTTTAGG 480

GCTTGAATAT TCTATATCTT CAAATTCGTT TGCTAGAATA ATTGCTACTT TTTTAGTCAT 540

TGAAAATCAC CTTTCTATAT ATCATTGATA TAATTACTAT AGACAAGTAA ATCAGTGATT 600

AAACATACAA GATATAAAAA ATATTAAGCG ACTGTCGCGA TATCTAACCC TAACACATCT 660

TATGTGGCAT TTACTTAGAT ACTAATTTAA CCTTTTCTTC AAGCTGATCT AACAATCCAA 720

TCCATTCATC TATATCTTCA ACACGTACTT CATCAGGATT TACATGATCG ATATCCTCAA 780

TAAACTTATT TAAACGCGCT TTTATCTGTT CGATTGTTTG CTGTTCATTC ATAAAAAGTT 840

AACTCCTTTT ATTTTGTTTT CTTTTTCATT ATTATCCTAA CAGAAATTGC GTTAAAGCGA 900

TATAATCTTA GCTATATTTA TGACATTCAA ATTATTTTGA CTTTTAAAAA TCCCCTTTTC 960

AATTAACTAA AATTAAGAGA TAATTTGTTA CGAGTGATAA TACGAaGkGG TaTCATACCG 1020

ATATGAACCA AATAGAAAGA AGGAAGTTTA AGACGATGAA TAGCGTCAAA TTGAAGCAAC 1080

CTGTTAGCAT TTACAATGAT CCATGGGAAG TGAAATTTAT ATACATTTAA ATTTCATGAG 1140

ACAATAAACG TTGATTTAAT GCGTTTTTTT GCCTTTTTTA TTTTCCTTAT TTTTTCTGTT 1200

TTACAACAAA ATGGTATCAA AAATGGTATC ATTTGTAGTT ATTTTAGCTT CACATATTAA 1260

AACAACCACA CTCCTAAATT AATAGGTGGT GTGGTTTTGT TGGTTGTGTG GGGATAAAAA 1320

TAACCGCATC AGTTAAGATG CGGTTATCTA GCAAGGGCCA CGTATTTATA AATACGTTTA 1380

GAATCTCTTC GGCAACTTTG CTATAGACAG TCTATGCTGT TACTAAATTA TACCACCACA 1440

CAAACCTACT CCCATTCAGG AACACAGAGC TTTGTCGCTC GTCAGCAACG TCATATGAAT 1500

TCTCAGTTCA TGTTGTGGTG ACACTTTAAA CGGTCTGTGC CAGTAGCGAC CGAGTCATTT 1560

CAAGAATGAC CATTTCACAT TTATATTATA ACACTTGTCG TGCGTAACTG TATAGTTTTT 1620

CAGTTGTATT TAAAGTTAAG TTATCTACTT CGCGCTTTCC TTGCCTTAAT TGTGAAATTA 1680

CATATTGCGC TACGCCAGTT TGTTTGTGAA TTTGGTAACC TGTTATATCA CTTTTGATCA 1740

ATTCAATTAT TTTTAATTTA TAATCACTCA TATTATCTAC GTCCATTCTT TTTATCTAAA 1800

CAATAAAAAT GTGTCTTTCT CCCGATAAAT AATAACAATG GTAGGCTTAA TAAAAACAAT 1860

ATTAAATACA TTTGTTCTGT CATAATTGAA AACCTCCAAA TAATATTATA TTATATAAGT 1920

GTAAGGAGGA GCCATCAGGC TCCAAGCATA ATGTTAATCT TTGTTGTTTG GCTTTCGGTC 1980

TAGGTAGCCG AGATGCCaTT CTCTAAGTTG TTTTAACACT TCTGGAATTA TCAGTACTGC 2040

CAATACTTGA TGTTCTAGAA GTGTTTTTAT TATGTCTAGC ATGAGGCTTT TCACCTCCTT 2100

ACACATAATT TGTAAGTCAT CAACTAACCT ACAAATATAA TTATACTAAA CAAATGTTTA 2160

GTTATCTACA TTTAAATCTT GAGAGAAATG TTAAAAAGTT CTAGTAAAAT AATAGCACAT 2280

TTTATCTTTA AATGTAAATA GAAAGCAGGT ATGTAACGCA CCTGCTTAAA TAGaCATGAC 2340

TATGTCATTC TAACTGATTT CTCCCCATAA GTCACCTAAT ATCTGATTAG GTGGGGCAGA 2400

ACCATTCCAT GTTCTAATAG GCAAGTAATA ACGTTGCCCC TCCCATGTAT ATCCTACCCA 2460

AACATGACCA TCTTGTAACA TCACTTCTGT ATAATCACAA TACCCACCAG GTTGGAACTG 2520

ATAACCCACT GGACAAGATA AGAATGGCCC CACTTTTCTT ACTGTGATTG GTTGATTGCC 2580

GTTTGTGAAT CTAGCACTTT CTTCCATGTA GTAAGTACCA TATTTATTAC GTTTCCATGC 2640

ACTTGCAACT GGTTTAACTG TATTACTTGA AGCGCTTGAC TCATTAGAGA CAGTGGCAAC 2700

CGGTATTTTA CCATCCATGT ACGCCCTAAT CTGCTTGATA AAGTAGTCTT TAAGTTGCAA 2760

CCGCTTGTCT TCTGGCAATA GACCGCGAGT TACTGGGTCA AAACCAGTGT GTAAAACCGA 2820

ACTTCTATGA GGGCATGATG TTGAAGTAAA TTCATTGTGC AATCTGATTG TATTTCTGTT 2880

TGCTGGTAAT CCCCATTTTT TCAACAATCT AGCGCATTCT TGGAAAGTTG CCTGTTCATT 2940

TTTTAAGAAT GTCGCGTTAT CTGCGCCCAT TGATTGACAT ACTTCAATAC CGTAATAATA 3000

TTTATTACCT ATTTGATTAG CGGTATGCCA ACCTACTTGT GATTCATCTA AGGCTTGCCA 3060

AACTGTGTTG CCTGATACGT AACTATGCGC AATGCCCGCT TCTAATCTTG ATAAAGGTGC 3120

ATTTACTAAT CCGTTACGAT ATGCTTCAGC AGTCGCCCCT TTGCTCCCTG CGTCGTTGTG 3180

TATAACTATA CCTTTAGGGT TACTACCACG CTTAGGTAGG TCATAACCTT TAACCACATC 3240

TTTGATGATT TTAAGTTCTA CTGCTTTAGG TTGTGGCTTA GCTGTTTCTT TTTTAGGTGC 3300

TTGTGTAGGA GATTGAACTG ATCGTGGCGC TGTCTCACTT TTAAAATTCG GACGGATAAA 3360

CCACATAGGG AAATCATAAG CATGTTGTCG TCTTGTAACT TTTTCCCAAC CCCAGCCGGG 3420

TTGTTCGATT CCGTCAGTCC AGCCACCGCC TAGCCAATTC TGCTCATATA CAATGATGTA 3480

ATCTAAAGTT GCTTCAATTA CCCATGCAAC GTGACCATAT CCAGCACCGT AGTTGCTACC 3540

GAATACCACC ATGTCGCCAG GTTGTGCTAA GAAGTCCGGT GTATTTTGGT ATACAGTAGC 3600

TAATCCGTCG AAGTTGTTAG CGAACGGAAT ATCTTTTGCA CCTAAACCTT TTAGAAGTAA 3660

TCCAAACAAA ACTTTCCAAC CAGCATTGGC ATAATCAAAG CATTGAAATC CATACCATAA 3720

GTCCACATTG AATTGTTTTC CCTCAGAAGT TTTCAACCAC TCTATAAACT CATTTTTAGT 3780

TAATTTTGCT TGCATTGTCG CCACCTCCAT GATGATACTC ATTCACATCA AAGCCAACAT 3840

CGTTAGAGGC GTCTGTGAAA GGTTGTGATG TATCATATTC TTTTGGTGcT TTCGCGCTTA 3900

T^rrcGrGT TAAACTACTG TCTTGTGATG ATTTCCACGT AACTTGTTGT TCTTCTTTTT 3960

TTGGGTCAGT AATAACGCCA ATACCTGTAA GTAACGTGAG GATAGCGCCT ATAATTGCGC 4080

TAGCTTGATT TAATTGAGTA GATAAATCTA ATCCGAATAA ATCCGTGACT TGCTTGATAA 4140

ATAGCAACAA TGCTCCAACT AAACCAGTTA GTACTGCTTT GTTTTTGAAT CTCAATTTCC 4200

AGTTAATATC CATTTGTTTG CTCCTTTTAT CCAAAATAAA AAAACGACTA AAAATTAGTC 4260

GTTTAAAATT ATTCAATGGT CAATGTCGGA GATCCTGAAT AAACATCACT TATAGTGACG 4320

TACAACATCC CTGAAGGATT ACTAAAGTTG ATATTTTTAC TTGCAACTCC GCTATTGACT 4380

CCTGATATTC CTAAATCACT TGACCCTAAA TTAGTTTGCG AAATCCTCAT TATACCGCTA 4440

CGTACATTTT CTATTGTCAC CTGATAACTT TTATTGGGTT CAACTCCATT TATTGTCCAT 4500

TTTGCTGTTG ATTCTTCTAT GCTATCCGGA TATTTATTTT TAGGTAAGGG TTTTATTACA 4560

AAAGATGAAG GCTTTTTCCA TACTTGGATA TTTCCAGCAT ATACTTTTGT ATATTCTTCA 4620

CCTTCGTAAA TAAACTTCTT TACATTTTTA AAATTACCTT CCATAAAAAT CACCCTTTAA 4680

TTAAATATAA CGTATTCGGG TCTTTTTGAT ATATATAGTT ATATTCATTT TCTGTTCCTG 4740

TCCAAATTTT AACCGTCGGT TGAGATGCGC TTTTTAGTTG ATATAAATTA TCCGCTTGTT 4800

GTTTAGTAAA AGCTTGAGAT GACAAAACAT ACCGCTCGTC ATGATTATGA TTTTTTGGAG 4860

CATATAAATC ATTTAGTGTT TGTTTGAATT CCTCAAAATC TTCTGTATTA ACTTTTGAGC 4920

CAATCTGTTG CAATACACTT TCTGAAATAG AGTTGTTTTG TATTGCTTCT GCTAATTCTC 4980

TTAATGTGTT CATAGATTCA GGCGCGCTAT CAACTAGTTC AGCAATTTTT GTATCCGTAT 5040

ACGTTTTAGA GTCGTTGAGA GTTGTATCTT TGATTTTTTC AACTTCTTGC AATTTATTTT 5100

CTAACCCTTC AACATTTGCG ATATTGATTT TGTCCAATAA CTCAGGTTCT GCTTTGATAT 5160

CTGTATCTTT ACCATCAATT TGCCACATTT TAGTGTCAGG ATTGATTGAT ACTACAGTAC 5220

CGTTXTTACC GGGTGCGCCT TGTTCTCCTT TTTTACCTGC TTCACCTTTT GCTCCAGGTT 5280

GTCCCGGTTC ACCTTTATCA CCTTTCGCAC CTTTAAATCT ACTTTCATTC TTTTCGATGT 5340

AAGAAATGAC ATCTTTATCT ATTTTCTCTT TAAAGTCTTT GCTCAATAAA TCTGTCGCGT 5400

TATCTTTTAA AATTCTCGTA ATAGCATCAT CTACCAATTT AACATCGATT TCTTTTGCTA 5460

CAGCAGATTC AATACCACTA TCAACGATAT TGAAAGAAAA GTTTGCGACA TGTATTTTTT 5520

CTTCTTCTTT CTCTAAAAAC AGCTTACAGC GAACATAACC AGCGTGTTTG ATAACCTTTT 5580

TAGGTATCTT GTAGGTAAGG AAACCTTTTA CAACATCGTC GATAATAAGG GGCTCATTTT 5640

TGAATATAGA GCCATCTTCC ATAAACAAAT GTAATCTAGG TGTTAAGCCA TGTGCTTTTA 5700

GATCGATACG ACCTTGTTTG TCATTGATAC CTATTCTTAT AGATGCTGTA TTTTCATCTT 5760

CAACATCTTT TATTTTGTAC ATTTACACAC CTCTTTATTT ATATTTATCC CTTGTGAAGT 5880

AGATACCTTT TAAGCCGATT TGTTTATATA ACTTAGCGAT TGTACTTGCT TGATGTTGGC 5940

ACCACTCTAT AGCAGTAGCG TATTGGTGGG TAGCTGGATT CTTAGGATTC CATCTAATTC 6000

GGTACAATGT GTTTTGACCT TTATTGATGT AATCCTTTCT TACGAAGCTA GCACCGCCCA 6060

TGATTGCTTT TGCTGGAGAT GTCCAACCTT TATTCCTTGC AAACGTCATT GCGTAGTTAG 6120

GATTGTTGTC GTAAGCGCCA ATGCCGAAGT AGTTGTATAC TCCATCTTTT CCGTTAGCGA 6180

AGTTACTTGT TCCATATCCA CTTTCTAAGA AAGCATGCGC GATTAAATAA ATTTCATTAA 6240

TGTTGTGCTT TTTACAAGCT TCTGCGAACG CTTTACCTTG ATTATTCAAT GTTCCCTTAC 6300

CTTTAAGTAT CTTATTAAGT GCGCTAACTG AAACACCTTG ATACTTGCCT AAATTAAGCA 6360

TTTGGTAGCA TTGTGTGTTA CTTTCCCATA TACGCTTTAC ATTCATTGCT GAACTCGTTT 6420

GTGCTCGTGT AGCGTTAscC AACCCCAAGC ATTAGATTTT TTCGGGTTAC CTCTTGCCAT 6480

TTGTTTATCC AGTGCTTGTT TGAATGTATA AGGACTCGTT TCTGTTATGA TCTGCGGTTG 6540

TTTAGATGCC GAACCATTGT TGGCTGTTGG TGACGAGTCT CTTACATTAG CTATATCAGC 6600

GTTTTTATTA TCTACCATAA CTTTTATTCT AGATTTTGTT ACTGTTGGCT TAGTTATAGA 6660

ATTTAATAAT TTTTCTCTGT TTTTAAATAT ATTAAGTAAT GCCTTTTCTA ATGCTTCGTA 6720

TTTATCTTTA GGAGGAACAC CGTTGTCAAT CATATTCCAA TTAACATGTT CCAACATTGA 6780

ACGCCAAATG CTGTCGTCTA CTTTTAAATT TTCAATACTT AGAGGTATCT CATATTTGGC 6840

CATCATATCT ACAGCTACAA CCATTGCGTG AATCTCATTA AAAATAAATT CATTTTTACT 6900

CGCACTATAA TCTTCACATA CGTCTATAAC TATATAATCA GGTTCATTAG GAACTTCAAA 6960

TACAGCTCTT CTAGGTGCCC AAATATTATG TCTATCAACA TAAAAGTGGG GATATTCTAC 7020

ATCCTGTTTG TATTTCTTCC TACTGTTATA TAAACTTTCT ACCGAGCTCA TCGTTTGTGC 7080

GTTTCTAATC ATTATTCCTT TAGGTTTTTC GAGTCGTCGA TTACCTTCTA CTATAAAGTG 7140

ATAAATATAT TCTGGATAAT TAACCTCTTG GCTAGAAATA GTGTACTTTA TAGTTGTTAC 7200

ATCTTTCCAA ATTGGAACTT TTTTATTATT TTTTTCGTTA TCATCACTAT CATCTTCTGG 7260

TTTAGGTGCC GGTGTAGTTT TGTCTGGATG ATATGGTGGT CTAACAAAAT ATTTAACCCC 7320

TCCACCTGGT CCATCATGAT AAGAGTGTTT AATTTTATAA GGTGGACTTC CTGTTGCGTT 7380

ATTTGTATAC CAGTTTTGAT CTACGCCATA CCAATAGTCT TTTGTGCATG GTCCCACTAC 7440

AATGTTTACA TGTCCTGCCC AACCACCAGT CCAAACACCC CAGTCGCCTG GTTGTGGTAC 7500

AAAATCTTTT GTATTTCTAA TTATCTTGAA ATCTCTACCT CTATAATTGG ATTTTTGAGC 7560

TAAATCCCAG CATTGTGCTC CCATTCCAGA ACCAGGTACA TCAATAGCTA TTTTGTTTTT 7680

AGCGATATAT AACGCCCATT CAACCACTTC ACTAGCTGTG GGCTTTCTAT TTTTCGGATT 7740

AGGTAATCCC ATGTATGCAC CTCATTTCAA TCAAAATAAA AAGCCAGTGC CGAAGCACTG 7800

ACTCTTAACT GTTATTTACA TTTACCAAAC CAGAAGCACG CCCAGAAGCT ATATCCTAAA 7860

ATCCCTTTAA GCATGGTAAT CACCTCCTTT AAATACCAAA AACAGTTCTT AGTAAAGCTA 7920

TGACAATCGT ACTGAAGATA GTCCCTATCA AACCTAGAAT CCACATTTTT ATGTCTCTAA 7980

TATTCTTGGC ATTCTTTTCT TTATTCTTTT CATCTTCTAC CTTGTCGCGC TTTAATTCTT 8040

CAAAATTTCT ATCTAATTTG TCATAAATCT TTTCTTGCGC TCTAAGACTA TCTTCTATTC 8100

TGTCGAATTT TTCAAACATA GTCTTATCAT TTTCTTCTAA TCGCGTTAAA CGCCAATCTT 8160

GTTCATGTCG TTTGGTAAAT CCAAACATTA TGCCACCCAC TTTATTCAAA TTAAAAAGCC 8220

ACAAGCATTA CACCTGTGAC TTTTCATCTT TTGTTTCTGG ATATTTTTCT CCAGTGATTA 8280

AAGCGTATTC TTCTTTATCG ATTAAACCCT TGTCTACGTA CCACTTAATT TGCTCGTTTT 8340

TATAGTAACC CCAAACATAA AAAGTTTTAA TGTCTTTAAA AGTTGGATAA ATCATCTTCA 8400

TTATTTAAAC GTCCCCCTCA GTACTTGTTT TGTTAGTTTT CAGTTCAGTC AACTGTTGTG 8460

TTAACATAGC GTTTTGTTGA GCTAATTCCA TTGTTAATAC GTTTACTTGT GCCACCTGCA 8520

TTTGCATACT CGCAACCATT CCGCGAAGTT CCTCATCACT TAAATCTGAC GCACTTTGTT 8580

GGTTTGATGC ATTCGGTACG TCTTCTTTTT CGAAATTGCT ATTGTATTTA ATTTCGCCGT 8640

TAGTGAAAAC AAACTTTCTA GGTTCGAACT CTTCTTTAAA TTTAATAGGC ACATTGTTAT 8700

CATCTACATC TAAACTATTG CGTAAACCGC CAGTATTAAC GAATCCGATA ACTTCGTTTT 8760

TATCGTTTAC TGTGATTTTC ATTATTTCCA CCCCATAATT TTAGTTATAG TAACTTTGTT 8820

GGCAJTCGCT CCAGAACCTG ATGTTTTACC TAAATCAAAG TACACATCGT TATCTATTCT 8880

TAAAGTAGTG CTACTTGTTT TGGATAGTAA GCACTCATAA ATACCGCCAC CGTTGCCGTC 8940

TGAGTCAACT ACATTCGCTT TACTCAATTG AATCGCGTTA GGTAATGCGG TTAGTCCGAA 9000

TCCCTCAATA ACGCCACCTG GATAAGTTCC ACTTACCAAC AAAATAGAAT AGTTTGTGTA 9060

CGGTTCAGTT AGATTGATTG TTGTACCTAC ACCATTTGCG CCACCGTCGA ACAATACCGT 9120

TGATTTATGT TCATTAGGAA CTGTCCACTG TTGCTCAAGT CTGCCGTTTG TGATTGATCG 9180

TGTGTAAATC TTTTTAGAGT TATAAGGTGT GAAGTTAAAT AGCTTGTTTG TATCATCTTT 9240

AACGAATACC GATAAATAAC CCTCATAACT TTCAACGCTA CCTGGTAAAT CCGGCACTCT 9300

TGTTGCATAG TAATTACCAG CAGTTAAATA TCCCAAATCG CCTTGCGCAT TATTTAAGTT 9360

GAATTTATCA TCTACATACT GCTTAGCTTG ATTTAAAGCG TTGTTAGACG TTTCTTCAAC 9480

AAATTGCTTA GTTAAGTTTC CATCATTCTT TTTATAAAAC GGGTACCATG TGCCGTAGAT 9540

TTTGTATTTT GTGTACTCAT CGTTTGAATC GTCTGGGTAC CATGTTGCAC GAGCAGTATT 9600

ATTATCAACA ACATAAACAA CTAACACACC AGATTTGCTT GATGTATAAG TTGATTCATC 9660

GAACGAAGAA CCGTCATCAA CACCATCTTG TCCAGGCTTC TCTAACGTGC CTATATCCGT 9720

CTTTTCTGGC GCATCTGTTG CATTAGTAAT ATGAATAATC CTAGATGTGT TAACTGCGCT 9780

TAAAACGCTA TCTATGGACT GCTCATACGA TTCAATTGCT TTACCGTAAT CATCTGTAAG 9840

TTTAGACTTT TGCCAATTCG TTGTTGAATT ACCTTTAACA AGGTCAGCGC CATTGATTTG 9900

TTGTTCAACT TCGTTAACAC GTTCAAAAAT CGCTTGCTCT TTTTCAACTA TTTTATCGAA 9960

TTCAGCTGTA ACAGCTTGTG TTGCACTAGT TTGCGTCGCA GTAATAGCTT GTATAGCTTC 10020

GTTTTGCTTG ATTTCGATTT GTTGAATGCC TTTTGTCGCA CTATCATTCA CTTTTGCTAT 10080

TAACGTTTGT GTATCAGCCA TATTTTGCTT TAATTGGTTA AAATCTTTAC CGACAGCTTC 10140

GATAGTATCT TGAATAGATT TGATATAAAC AAGCTTTGTT ATACCATCAA ACCCACTAAC 10200

TAAATCATTT TCAATATTGA AGCTAAATTG ACGTTCAACA ACAACATTAT TACTCCCGTT 10260

TTGTGTAAAG AATGCCTGAG CATGCACCTT GCCTGAATGT TTTAAAAATT CATTCGGTAT 10320

CACATACTGC AAACGCCCAT TAATTGCGTC TACTATCGTT AATTCGTCTG AAATATAAGC 10380

GCCTCTATCT ACGTTATAAT CATCGGTTTT TAAnACGATA GATGTTTTAA CATGTTCAGA 10440

ACTTATAGAT AAGGGTCTGT TATnCTTAGT 10470

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3647 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

ATCAGATCTT GAGAATCGAG TTATTAAGTC TATCGAAGAC TTAACTAAAA TCCAACCATT 60

CATGCCTACA CAAGATTTTG ATTTTAAAAC TAAAGAAATT CAATCAAACA TTTCTGAAGA 120

AAGATTTATC GAAATGATTC AGTATTTCAA AGAGAAAATA ACAGAAGGGG ATATGTTCCA 180

AGTTGTGCCA TCAAGAATTT ACAAATATGC GcATCATGCT AGTCAGCATT TAAATCAACT 240

TTCGTTTCAA CTGTATCAAA ATTTAAAACG ACAAAACCCA AGTCCATATA TGTATTATCT 300

TCAAATTGTA ACAACTAATC CTATTGCAGG TACGATTCAA CGTGGTGAGA CGACACAAAT 420

AGATAATGAG AATATGAAAC AACTACTTAA TGATCCAAAA GAATGCAGCG AACATCGTAT 480

GCTAGTTGAT TTAGGACGTA ATGATATTCA TAGAGTAAGT AAAATCGGTA CCTCAAAAAT 540

TACTAAATTA ATGGTTATTG AAAAATATGA ACATGTTATG CATATCGTAA GTGAAGTCAC 600

AGGTAAAATA AATCAAAATT TATCGCCAAT GACAGTTATT GCGAATTTAT TACCAACAGG 660

TACCGTTTCA GGTGCACCAA AATTACGTGC AATTGAAAGA ATATATGAAC AATATCCACA 720

TAAACGGGGC GTTTATAGTG GTGGTGTTGG ATACATAAAT TGTAATCATA ACTTAGATTT 780

TGCATTAGCA ATTCGAACGA TGATGATAGA TGAGCAGTAT ATCAACGTAG AAGCTGGTTG 840

TGGCGTTGTA TATGATTCTA TTCCTGAAAA AGAACTGAAT GAAACGAAAT TGAAAGCTAA 900

AAGCTTATTG GAGGTGAGCC CATGATCTTA GTTGTAGATA ATTATGATTC CTTTACATAT 960

AACCTAGTGG ATATTGTTGC TCAACATACT GACGTCATTG TTCAATACCC TGATGATGAT 1020

AATGTGCTGA ATCAATCGGT GGACGCTGTT ATTATATCTC CTGGTCCAGG GCATCCATTA 1080

GACGATCAAC AGTTAATGAA AATCATATCA ACCTATCAAC ACAAACCCAT TTTAGGTATT 1140

TGTTTAGGGG CTCAGGCACT GACTTGTTAC TACGGTGGAG AAGTCATTAA AGGCGACAAG 1200

GTTATGCACG GCAAAGTTGA TACACTAAAG GTTATATCGC ATCATCAACA TCTGTTATAT 1260

CAAGATATAC CAGAACAGTT TTCAATTATG AGATATCATT CATTAATAAG TAACCCTGAC 1320

AATTTTCCAG AAGAATTGAA AATTACTGGA CGTACCAAAG ATTGTATACA GTCATTCGAG 1380

CATAAAGAAA GACCGCATTA TGGTATTCAG TACCATCCTG AATCATTTGC TACAGACTAT 1440

GGTGTCAAAA TAATTACAAA TTTCATTAAT CTAGTGAAGG AAGGATGAAA ACCATGACAT 1500

TACTAACAAG AATAAAAACT GAAACTATAT TACTTGAAAG CGACATTAAA GAGCTAATCG 1560

ATATiCTTAT TTCTCCTAGT ATTGGAACTG ATATTAAATA TGAATTACTT AGTTCCTATT 1620

CGGAGCGAGA AATCCAACAA CAAGAATTAA CATATATTGT ACGTAGCTTA ATTAATACAA 1680

TGTATCCACA TCAACCATGT TATGAAGGGG CTATGTGTGT GTGCGGCACA GGTGGTGACA 1740

AGTCAAATAG TTTCAACATT TCAACGACTG TTGCTTTTGT TGTAGCAAGT GCTGGcGTAA 1800

AAGTTATAAA ACATGGtAAT AAAAGTATTA CCTCaAATTC aGGTAGTACG GATTTGtTAA 1860

ATCAAATGAA CATACAAaCA ACAACTGTTG ATGATACACC TAACCAATTA AATGAnAAAG 1920

ACCTTGTATT CATTGGTGCA aCTGAATCAT ATCCAATCAT GAAGTATATG CAACCAGTTA 1980

GAAAAATGAT TGGAAAGCCT ACAATATTAA ACCTTGTGGG TCCATTAATT AATCCATATC 2040

ACTTAACGTA TCAAATGGTA GGCGTCTTTG ATCCTACAAA GTTAAAGTTA GTTGCTAAAA 2100

AAGCAACACT ATCTGGTGAT AATTTGATAT ATGAATTGAC TGAAGATGGA GAAATCAAAA 2220

ATTACACATT AAATGCGACT GATTATGGTT TGAAACATGC GCCGAATAGT GATTTTAAAG 2280

5 GCGGTTCACC TGAAGAAAAT TTAGCAATCT CCCTTAATAT CTTGAATGGT AAAGAT CAGT 234 0 

CAAGTCGACG TGATGTTGTC TTACTAAATG CGGGTTTAAG CCTTTATGTT GCAGAGAAAr 24 00 

TGGATACCAT CGCAGAAGGC ATAGAACTTG CAACTACATT GATTGATAAT GGTGAAGCAT 24 6 0 

10 TGGAAAAATA CCATCAAATG AGAGGTGAAT AATATGACGA TTTTATCAGA AATTGTTAAA 2520 

TATAAACAGT CACTTTTACA AAATGGCTAT TATCAAGACA AACTTAATAC CTTGAAAAGT 2 58 0 

GTGAAGATTC AGAATAAAAA ATCTTTTATA AACGCAATTG AGAAAGAACC AAAGCTAGCA 264 0 

15 

ATTATTGCAG AAATTAAATC GAAGAGTCCT ACAGTTAATG ACTTACCTGA ACGAGATTTA 27 00 

TCGCAACAAA TCTCAGATTA TGACCAATAT GGTGCAAATG CCGTGTCCAT TTTAACTGAT 2760 

GAAAAGTACT TTGGTGGTAG TTTTGAAAGA TTACAAGCAT TGACGACAAA AACAACATTA 2 82 0 

CCCGTATTAT GCAAAGACTT TATTATAGAC CCGCTTCAAA TTGATGTTGC TAAACAAGCT 2880

GGTGCATCTA TGATTTTATT GATCGTTAAC ATCTTATCTG ATAAACAATT GAAAGATTTA 2 940 

TATAACTACG CTATATCGCA AAATCTAGAA GTGTTAGTTG AAGTACATGA TCGCCATGAA 3000

TTAGAACGTG CCTATAAGGT TAATGCTAAA TTGATTGGTG TAAATAACAG GGACTTAAAA 3 060 

CGATTTGTTA CAAATGTGGA ACATACAAAT ACTATTTTAG AAAATAAAAA AACAAATCAT 3120 

TATTATATTT CTGAAAGTGG TATTCACGAT GCATCTGATG TAAGAAAAAT CTTGCATAGT 3180

GGTATCGATG GCTTACTAAT AGGTGAGGCG CTTATGCGTT GTGACAATCT ATCTGAATTT 3 24 0 

TTACCACAAC TGAAAATGCA AAAGGTGAAG TCATGATGAA ATTGAAATTT TGTGGCTTTA 3 3 00 

CATCAATAAA GGATGTTACA GCGGCCAGTC AATTACCTAT TGATGCGATA GGTTTCATCC 3360

ATTATGAAAA AAGTAAAAGG CATCAAACAA TTACCCAAAT AAAAAAGTTA GCGTCTGCTG 3420 

TTCCAAATCA TATCGATAAA GTATGTGTCA TGGTAAATCC TGATTTAACA ACAATTGAAC 3 4 80 

ACGTATTAAG CAATACGTCA ATTAACACAA TACAGTTACA CgGCACAGAA TCTATTGATT 3 54 0 

TTATACAGGA AATTAAAAAG AAATATTCAA GCATTAAAAT CACTAAAGCT TTAGCTGCaG 3 6 00 

ATGgAAAACm TwATCCCAAA caTtAAtnAA tnTTAgGGGG TCCGTGG 3 64 7 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5966 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

CcAcCTTGAC CACCTTTACG TGGAATCTTT TCmCCTkGAG CAACaTCGaT AATaTATATT 6 0 

GAAAgTCAAC AAGTTCTGGA CTAAATGTTG CTGCTAAGTT ATCGCCACCA GATTCTATGA 12 0 

AAATTAGTTC TATATCGTCA TGACGTTCTA ATAATTCGTC TATTGCTGCA AAGTTCATAG 180 

ATGCATCTTC ACGAATCGCA GTATGAGGAC ATCCACCAGT TTCAACACCA ATGATACGAC 24 0 

TTTCAGGTAG AACTCCTGAA TTTACTAATA TCTTTTCGTC TTCTTTTGTA TATATATCAT 3 00 

TTGTAATAAC GCCGATACTC ATTTCTTTTG AAAGACGTTT TACAACTTTT TCAATTAATT 360 

GTGTTTTACC TGCACCTACA GGACCACCAA TACCAATTTT AATCGGATTT GCCACAATTA 42 0 

TAACCTCCTA TGATATGAAA tTCTAACATT GaCGTTCTCA TGCGCCATTT GATTTAGTTC 4 80 

TAAACCAGGC GCTGTCATGC CAAAATCTGC TTCTTTTAAT TCGAAAATCT GCTTTCTTGT 54 0 

TCCTTCTATA TAAGGAATCA TGTGAGTAAC TATCTTTTGA CCAGCAGTTT GTCCAAGTGG 6 00 

AATAGCACGA ACAGCATTTT GAGTTAAACT TGAAACATTT TGATATAAAT AGTAATCAAT 660

AATCGTTTCA ATATCTACAC CTAAATGATG GCCTAGCATA GTAAAACAAA TAGCTGGATT 72 0 

TnACTTTGCT TTCTTATCTT GCATTTGTTG ATGATACCAA GCAATCCATG GGCTATtATA 78 0 

AAGTTCTAAA GCCAATTTAA CCATGCGAGT CCCCATTTGT kTTGCACCAA CACGTGTTTC 84 0 

TTTAGGTAAG TTTTGrACAr ACATCAGTTT ATCTATGTGT AATACTTTTT GTGTATCATC 9 00 

ATTTTCCAAT GCATCATAAA CTAaACGCAT GGCTAAACCA TCAGAATAGG TAAGTTGCTC 96 0 

TTGTAAAAAC ATTTTTAACC AAGCAATAAA AGTATGATCG TCATGAATTA TATTTCGTTG 102 0 

AATATATGTT TCAAGACCAA ATGAATGACT GAAAGCACCT GTTGGAAACT GTGAATCACA 1080

GAACTGAAAT AATCTTAAGT GTGTATGATC AATCATGAGA ATGCCCTATA TGTCTGAAAG 1140

CCTTATTAAC TTTACGGTCT TCTCGAACAT ATGGGATGCC TAAACTTTTT AATAAATCTT 12 0 0 

CAACTAAATA ATCATATTGT ACTAGCATTT CAGTCTCTGT AAATTGTGCT GGCAAATGAC 1260

GATTTCCTAA TTGATGGGCT ATATCTCCCA TTTCTTGCAA TGTTCTTGGT TGAATCACTA 132 0 

AAAGATCTTC TGAATTAACA TCCACAATAA TCATATTATG GTCATCTGCG TATAAAATAT 13 8 0 

CTCCATATTG TAAGTCAATA GGTTGTTTTA AACGAATGCC TATTTCAGTG CCATGGTCTG 144 0 

TAACGACTCT TTGAATACGT TTAACAAGAT CTGAATTTTC AAGGTATACT TTTTCGACGT 1500 

GCTTTTGTTT TTCTGAATTT GACAAATTGG CAATATTGCC TTGGATTTCT TCAACAATCA 156 0 

TTCTATGTTC CTCCTAGAAT AAGAAGTATC TTTGAGTTAA TGGTAACTCA GTTGCTGCAT 16 2 0 

TACTTGTAAT TTTTTCTCCA TCTACATATA CTTCATATGT TTGTGGATCA ACGTCTAATT 16 8 0 
GACGCACCAT GCGTTTTAAA TTTAATGCAC GATTGATACC ATTTTCATAA GCAGTTTTAG 1800

ACACGAATGT CATTGACGTA CTTGTAAGGT TTCCGCCGTA TTGACCATAC ATTTTACGGT 186 0 

ACTTCATCGG TTCAGATGTA GGTATAGAAC CATTTGCATC GCCATTTACG GCAGAGTTAA 1920 

TTAATCCGCC CTTTACAACT AATTCAGGTT TAACCCCAAA GAAAATTGGG TCCCATAAGA 1980

CAATGTCAGC TAGTTTGCCC GGCTCGATAG ATCCTACATA TTCAGAAATA CCATGTGTAA 2 04 0 

TTGCTGGGTT AATTGTATAT TTAGCGATAT AACGTTTGAT GCGATTATTA TCATTATGTT 210 0 

CAAAATCACC ATCTAAAGGA CCACGTTGTT CTTTCATGCG ATGTGCTACT TGCCATGTTC 2160 

GTGTAATTAC TTCACCTACA CGGCCCATTG CTTGTGAATC GGAACTAATC ATACTGAATA 2220

CACCCATATC TTGCAGAACA TCTTCTGCTG CAATCGTTTC TTTACGAATA CGTGAATCTG 22 80 

CGAATGCGAT ATCTTCAGGA ATAGCCGCAT TTAAATGGTG AGTAATCATT ACCATATCTA 234 0 

AATGTTCATC TACAGTATTA TGTGTATAAG GCAAAGTTGG ATTTGTAGAT GAAGGTAAAA 24 00 

TATTTGAAAA TGCAGCGGAT TTAATTAAAT CAGGCGCATG ACCGCCACCA GCACCTTCAG 24 60 

TATGGTACAT ATGAAGTACA CGGTCTTTAA CAGCAGCCAT TGTGTCTTCC ATAAATCCTG 2520 

CTTCATTTAA AGTATCTGCA TGTAATGCAA TTTGAACATC AAATTCATCA GCAACATCTA 2 580 

ATGCATGACT CAAAGCAGAT GGTGTTGCAC CCCAGTCTTC ATGTACTTTT AATCCAATTG 2640 

CTCCGGCATT GATTTGTTCA ATGAGTGCAG TTGGATTTGT TGCTTGTCCT TTACCTGTAA 2700

AACCGACATT AATCGGTAAA CcTTCGGCAG CTTCTAACAT TCTATGAATA TGCCATGGAC 2760 

CTGGAGTTAC AGTTGTTGCT TTAGAACCTT CTGAAGCACC AGTACCACCA CCAATATGAG 2 820 

TCGTAATACC ACTTTCTAAT GCGACCTCTG CTTGTTCAGG ATTAATAAAA TGAACATGAG 2880

TATCAATACC ACCAGCAGTG ACGATTTTAC CTTCAGCGGC AATGATATCT GTTGTTGAAC 294 0 

CTATAATAAT GTCGACATTA TCCATTATAT CTGGGTTGCC GGCATTACCT ATGGCGAAAA 3 000 

TATAACCATT TTTAATGCCT ATATCAGCTT TAACCACTTT ATCGTAATCG ATAATAACGG 3 06 0 

CATTAGAAAT GACAAGGTCT GCAACGTTCA CGTCATCACG TGTTACACGA GGATTTTGCG 312 0 

CCATACCGTC TCTAATAGAT TTACCACCAC CAAAAGTAGC TTCTTCACCA TAAACCGCAT 318 0 

AGTCTTTTTC TATTTGAGCA AATAGATTCG TATCACCTAA ACGAATGGAA TCTCCAACAG 3 24 0 

TTGGACCGTA TAAGCTCGTA TATTGATTTT GCGTCATTTT AAAGCTCATG ATCTTTTTCC 3300

TCCTTTTTTA TTCACGTTTT CAGCACCGTT ATCTCCGAAT ACACCTGCAT ATTCATCATT 3360

~TCATCAGTT GGGCGATAGA CACGTGACTC ATCGATAGGA CCATTGACCA TACCACGAAA 3420
TTCGAAATCT AATGCTGCAT TTGCTTCATA AAAATGAAAA TGTGAGCCCA CTTGAATTGG 3600

TCGATCTCCT GTATTTTCAA CTTCGATAAC TGTTTCAGGA TGATGGTTAT TAATTTCAAC 3 660 

CTCTGTACTT TTTGTAATAA TTTCTCCTGG TATCATTTGA CTGCCTCCTT TAAACAATAG 3 720 

GGTGATGTAC TGTGATTAAC TTAGTACCAT CGGGGAACGT AGCCTCGATT TCGATATCTG 3780

TAATCATGTG TTCGACACCA TCCATGACAT CTTCTTTGTT TAGAATTTGT CTACCATAAC 3840

TCATTAACTC TGCAACGGTC TTACCATCGC GTGCACCTTC TAATAATTCA TCGCTGATTA 3 9 00 

AAGCTAATGC CTCAGGATGA TTTAGTTTCA AACCACGTGC TTTACGACGA CGTGCAACTT 3 96 0 

CCGCCGCCAC TACAATCATT AATTTGTCTT GCTCTCGTTG TGTAAAATGC AAATTAAAAC 4020

CCCCAATTTC ATATTAGATA CaATTTACAA AATTTATATT AATCCTAATT GTTGTGATAA 4 08 0 

ACAAGTAATA TACAAAGTTC AATGTGTAAT TAGAAAATTA TATTTTTAGC ATATCCGATA 414 0 

TTGAAGCAAA CAATCTAATC GAAAACAAAT AGTGGAATAT ATTTATGTAA AAACCAAAAT 4 2 00 

AGTTTTTAAT ATAACTTTTC ATAGAATAGT AGTATATTAA TGAGTAATGA TTCAAAGGAA 4 260 

AGGTGAAAGA TTTGAAGATA ATAGATGTGC TTTTGAAAAA TATATCTCAG GTTGTGTTAA 4 320 

TTAGTAATAA ATGGACAGGA TTATTTATCT TAATAGGATT ATTTGTAGCC GATTGGACAA 4 3 80 

TTGGATTAGC GGCTATTGTA GGTAGCATCA TCGCCTATAC TTTTGCGCGT TTTATAAATT 444 0 

ATAGTGAGGC AGAGATTAAT GATGGGTTAG CTGGATTTAA TCCAGTGCTA ACTGCCATTG 4500

CGTTAACAAT CTTTTTAGAT AAGTCAGGAT TAGATATTGT TATAACAATG ATAGCAACTT 4 560 

TATTAACGTT ACCAGTTGCT GCTGCAGTGA GAGAAGTTTT AAGACCATAT AAAGTTCCGA 4620

TGCTGACGAT GCCTTTTGTC ATTGTGACTT GGTTTACAAT TTTACTTTCA GGACAGGTTA 4680

AATTTGTAGA TACATCGTTA AAGTTAATGC CTCAAAACAT TGAAACGGTT AATTTTAGCA 4 74 0 

ACAATGATAG AATaCATTTC ATTCAGTCAT TATTTGAAGG ATTCAGTCAA GTATTTATCG 4 800 

AAGCGAGTGT AATTGGTGGC GTATGTATTT TAATCGGCAT ATTGATAGCA TCAAGAAAAG 4 860 

CAACACTCTT AGCTGTTATA GCTAGTTTGT TAAGCTTTAT CATTGTAGCT CTATTAGGTG 4 92 0 

GTAATTATGA TGATATTAAT CAGGGATTAT TCGGTTATAA CTTTGTATTA ATGGCAATCG 4 980 

CACTAGGATA TACATTTAAA ACAGCGATTA ACCCTTATAT TTCGACTTTT TTAGGTGTGT 5 04 0 

TATTAACAGT AGTGGTGCAA CTAGGTACAA CAACATTGCT TGAACCGTTT GGCTTACCTG 5100 

CATTAACATT GCCATTTATT ATCGTGACAT GGATTTTATT ATTTGCTGGT ATTAAACATG 5160

ACAAAGTAGA TGCTTGATAG TTAAATCAAA CCTAATATTG TTTGAATATC ACCTTAAACT 522 0 

ATACAGCGAA TTGTATAGTT TAAGGTGTAT TTTTATGGAT AAAATTAAGT GCATACTTAA 52 8 0 
GTGTTAAACT AGGAATAAAT AATTTATATT GTGTGTTGTG TGGGGTGACT AATATGAATG 5400

ATATGGATAA TTCCTTTTTA ATAACAACGG AAATTCAAAG AAAATGGATT GAAAAATTCA 54 60 

AAGTAATTAG AGATACATTT AAGGCTAAAG CTGAATATAA TGATCAACAT AGCCAATTTC 5520

CATATAAAAA TATTGAATGG TTAATTAAAG AAGGTTATGG AAAATTAACG TTACCAAAAG 5 580 

CATATGGTGG TGAAGGTGCG ACCATAGAAG ACATGGTTAT TTTGCAATCA TTTTTAGGCG 564 0 

AACTTGATGG TGCCACAGCA TTATCTATTG GTTGGCATGT GAGTGTCGTA GGACAAATTT 5700 

ATGAACAGAA ATTATGGTCT CAAGATATGT TGGAGCAATT TGCTGTTGAA ATTAATAATG 576 0 

GTGCATTAGT TAATAGAGCA GTTAGTGAAG CTGAAATGGG TAGTCCAACA AGAGGGGGAA 582 0 

GACCAAGTAC ACATGCTGTT AAAGCTGATG ATGGGTATAT TTTAAATGGT GTGAAGACAT 588 0 

ATACATCAAT GAGTAAAGCA CTAACACATA TTATTGTTGC TGCTTATATA GAAGAATTAG 594 0 

AAAGTGTTGG TTTTTTCTTA GTAGAC 5966

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17310 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

CTGTGTCATC GCGAAATAGT TAGGGTCATT CATTAATCCT TTTGAACGTA TTTCATCAAA 6 0 

ATATAACAAT TTCATTAGTA AAGGGGACTT GTTCAAACCA GCTATAATAC AAAATAGACC 120

TATAGTCACA CTGCTTATAA TATAAGAGGT AACGATCACT TTTTTGCTAT TACCTAACTT 180

AAAGSTGATC ATCCCTAAAT AGAAATAAAT GACTACAAAT GCATATTTAA CTGTAGATGC 24 0 

AAGAACTTCC TTAACCGTAA TAAATATCAA ATCATCAAAA AATaGCaAAC AArGCGTAAT 300

AATCATACGA TATGTATACA AAATAATGAm AAACTGTmAA AAATGATTTG CCTTTAATAA 3 60 

ATGGTTAGCG AAAAACAGTA AATAAACTAA TATTAGTAAT GTGATAAAGT CAGCTATAGA 4 20 

AACATTCACA CCGGCAATAA CCGAAGATTG CTGAATAAAA ACCGCTAAAC CGATAAGTAA 4 80 

CAATGTTAGT AATTTACTAT TGTGTTGATT TTCCATTATA AACGTCTTCC ACTTCTTTAA 54 0 

TCATTTTCTC CTCAGTAAAA CATTCTAAAT AACGTTTTCT AGATTGATTA CTCATTTTGA 6 00 

TGTAATCACT GTCTATTAAA TATTTTTCCA GGACTTTAGC AATAGTTTCG GGTTGGTTGT 6 60 
TTATTAAAAT AAACGTATCG TATTGTGATA ATAAATGACT CGCATTAATG ACATTGCCCA 840

AAAATGTGAC ATCATTTTCT AACCCAGCTT GTACAACTTG TTGCTGACAA TCATTTAATG 90 0 

TAGGTCCATC GCCTATAAAT GTAAAATGCG CATGATTACT GTTATGTAAT TTCAATATCT 96 0 

CTATTGCCGC GATTAGATTT TGTGGCAATT TTGGATAAGC AAATCTTGCA ATCATAACAA 102 0 

ATTGATGCTT TGTCGGGGCA TTAATCTGTA AATCTTGTTT ATTAGGCAAC ATTCCAACTA 108 0 

CTTCGCCAAT ATTGTTATGT GATTGGCTTT TTAGCGTTTG CTTAACAGCG GGAACATCTG 114 0 

CAATACCATT ATGTATTGTG GTTAATTTCA ATCGATTAAA TCGATATTTT AACGCTAACT 1200

GTTTATCGAA ATCTGAAACA CAAATAATGC TATCTGTAAT AAGTGACATT AATTTTTCGA 1260

TAACTAAATA TAGAAATTTT TTAGCTGGTT TAACACCCTC TGTAAAAGCC CATCCATGTG 132 0 

CAGTAAAAAC TATACGTGTG TCTTTCGATT TCGAAATGAa CTtCGCAATT CGTCcGACCG 13 8 0 

TtCCAGCTTT GGAAGAATGT AAATGGATAA CATCAGGTTT AATTTTCGAG AATAACTGTG 144 0 

CTAACACTTT GACAGCTAAA ATATCTTGTT TAAAGTCAAT TGGACCTACT AAATGTTCGA 150 0 

TAATAATTAC ATTAACTCTT GCATCTAGTT GTTCAATCAT TGGTCCATGA TTGCCTACAA 156 0 

TGACATAAAC ATCATTGTGT ACGCAAAAAT GGTTGGCGAG TTGAATGAGA TGTGTTTGTG 1620

CACCACCATT GTCTGCTTTA GTAATACAAT ATATAATTTT CAACTGTTAC AAACCCCTTT 168 0 

AATGCTATAC TTTCAATTTC TTAACATGGC TATCTCATCA GATGAATAGT ATTTATAGCC 1740

ATGCAAATCA ATGATGGCAC ATATTTCTTA ATGCCATTTG ATACTGTCTC AAGGGATTCC 1800

TCGTTATACT GTAACAATTG GTCACAATCT TTAAAATATA ACTTTTATTT GAACTTATTA 1860 

AGTAAATTAA GACTACCTTG AGCCTTCCCC TGTAATAACA ACCATCAATG TTCTAATTGA 1920

TATATATAGT TCCATCATTA AACTACCTTT ATGTATATAT TTCATGTCAT ATTTCAGTTT 1980

TTGTTGCGGT GTTAAGTCAT ATCCACCTTG AATTTGCGCA AGTCCTGTTA ACCCTGGTGT 2040

AACAAGACAT CTTTGCTCGA AACCTATCAC TTCTGAACTA AATAATTCTA CAAATTCCGG 2100 

ACGTTCCGGG CGTGGTCCAA TAAAACTCAT TTCCCCTTTA ACAACATTAA TTAGTTGTGG 2160

TAATTCATCA ATGCGTGTTT TACGAATAAA CTTCCCGACA TTTGTTATAC GATCATCATC 2220 

TTTATCAGCC CATTGCGCAC CGTTTTTCTC TGCGTTTTTG CACATCGAAC GTAATTTGTA 2280 

TATTTTAATT AATTTACCCA TCTTCCCAAC TCTAACCTGA CTATAAATAG GGTTTCCTGG 234 0 

CGAATCTATG ACGATAGCAA TGGCGAATAT AACCATAATC GGTAAAGTTA AAAATAATAA 2400

AACAATGCTT AAAATTAAGT CAATCGCACG TTTAATTGGG TAATAGCTTT TTCTCACTTC 24 60 

TTCTAGTTTG TCTAATTTTC TTTGATAGGC ATAACCCTTA TTATTATGGA CAGCTTCAAT 2 520 
AATTAAAGTA ATCCTTTAAA CCTGTTTCTA CTGTATATTT AGGAACAAAT CCTAATGCCT 2640

TTAAGTTAGA AATATCTGCA TAAGAATGCT TAATATCTCC TTTTCGTGCT TCTTTAAATT 27 00 

CATGCTCGAC TGATTTTCCA TATAATTCAC CAATAATACG ATAAACCTCT AATAAATTAG 2 760 

TAAAAGTGCC TGTACCAATG TTATAACCGT GTCCAATTGC ATCTTTGTGT TCCATAATTA 2820

AGCGTACAGA TTGAACAACA TCATATACAT ATACAAAATC TCTAGTTTGC AGTCCGTCAC 2 8 80 

CAAAAAATGT AAATGGCTTG TTATGCTCAA ATGAATCGAA CATCTTTGAA ATCACACCTG 2 940 

AATATTGTGA CTTAGGATCC TGTCTTGGCC CAAATACATT AAAAAATTTA ACAACCGCTG 3 000 

TTGGTATGTT ATATAACGAA CAATAATTTA ATGTCGTCCG TTCGCCGTAA TATTTATCTA 3060

TTGCATATGG TGATAATGGT AAGATTAATG ATTGATCACT TTTAGGCAAA TCAGGAAGAT 3120 

CACCATAAAC AGCTGCTGAC GAAGCAAAGA TAAAACGTTT TATATGATTA TTATATTTTT 3180 

TAATGATTTC TAACAATCTT AATGTTGCTA CGACGTTTAT TTCTTGAGAT AAGATAGGTT 3240

TCTCAACCGA CTCAGCAACA CTAACTAATG CTGCTAAATG AATAACATAA TCAAATTGAT 33 00 

ATGTCTTCAT GATTTGTTCA ACTGCATCAT ATTCACGAAT ATCTAATTCA AACACATGAT 3360 

CGTCAGCCAA ACTTTTAATA TTTTCTCGTT TACCTGTTCT ATAGTTATCT AGAACATAAA 34 20 

CATCATAATC TTGTTGTAAA TCATCTACTA AATGCGACCC AATAAAACCA GCCCCACCAG 34 80 

TTATCAAAAC TCTTTCCAAA TCTTCCACCT CATTTATACA TTAAAAATAT ATCATAAAAA 3540

CATAAAGTAT TGTAAGCTTT TTATCGATAT TTTTTATTTA TAAAAATAAA ATGAGATAAC 3 6 00 

TTTGTGAATT TTTATTGAGA TAAATTAGAT AGTGGTGTTT TTGTGATGTT TTATAATATC 3 660 

TTGGGTGTGT TAATACTAAT AATGCTTTCA ACTGATGCAT TAGACTGTGA CATCATAACT 3720

CACTTAAGAA CTTCGCTTAT TAATTTTCTA CCAATACACT CCCTTCTAAG TGCACTAAAA 3780 

AATCCTTACT GCTAAGTGAT TAAACTTAAC AATAAGGATT TATTTATCAT TAGTGGATGA 3 84 0 

TTATTAACGG AATCTCATAC CACCATCTAC AATAATTGTT TGTCCAGTAA TGTAATCAGA 3 900 

GTCTTTACCA GCTAAGAAGC TCACTACATT TGAAACATCT TCTGGTTGAG AAACTCTGCC 3 960 

CAAAGCAATC TGACTTGTAA ATTGTTCCCA ACCCCATGCT TCAGGTTTAC CTGCTTCTTC 4 0 20 

GGCTGTTGCC ACTGCGATAC TTTCCATCAT TGGTGTTTGA ACGATACCAG GTGCGAATGC 4 08 0 

ATTCACAGTA ATACCTTCAG ACGCTAAATC TTGTGCGGCT ACTTGTGTTA AACCTCGCAC 414 0 

TGCGAATTTT GTACTGCAAT ATAAAGACAA GCCTGGGTTA CCCTCAACGC CTGCTTGAGA 4200

T^TTT^ATTr: ATAATTTTAC CGCCATGATT GAATTTTTTA AATTGTTCAT GTGCGGCTTG 4 26 0 
GCCAAATTGC GCGGCAGTTT GTCTTAcTGC GTTAAATACA TCATCACGGT TTGATACATC 4440

TGCTTTGATA GCAATAGCTT TTGTACCATC ACTTGATAAT TTAAGTGCAG CTGCTTTTGC 4 50 0 

CCCTTCTTCA TTGAAATCAA CAACTGCTAC TTTGAAACCA TCTTCCACTA AACGTTCTGC 4 56 0 

AATTTTAAAA CCAATCCCTT GTGcTCCGCC AGTTACTAAT GCTACTTTGT TGTTTGTCAT 4 62 0 

AAAGATCACT CCTCAAATTT CTTTCCTTTA ATTACATTTT ACTCCTCTTC ATTTGAATAG 4680

TACAACAAAG GTAGCTCCAT TTAACAAAAT ATTCAGATAT TTAAGGTATA GTTAAACGCA 4 74 0 

CTACCATTAG TGATTGGCAA TGCGTTTAAA TGTCGTTTTA AAAGTTCTTA TGTTGAATAT 4800

TATTTTTTTA AGTCTCTCGA TTAGTTTGTC ATCAATCTTT TTTCGAGACA TGGTCTTTTG 4860

ATTCAATAGG CGGTTCCGTG TTATCACTGA CAACTTTAGT TGTAGCTTCA TCTTTATGTA 4920

TTTCTTCGTT AAATCCTTCA AGGTTTTTAG TCGTGGGATT TTTAACCTCA GGATGTTCCA 4 98 0 

TCATGTCTTG ACTATCAAGT TCCTTTTTAC ACGTGTCTTT ATGTGATGCT TGATTTGCGT 5040 

TCCCTTTACT TTTTTGAATA GTGGTAGTAT CTGCTGCAGC TACTAATTTT TTTCTACCTA 5100 

AAATAGATAT GGCTGAAACA AACCAGAGTA TTGCAGATAC AAAGTTGCAT AATACTAAAG 5160 

CGATAATAGC CAATACAATT AATATGACAC CTTTTGAAAT CCTTTCTTTA AATAAGTCAG 522 0 

ATGCCAATAC GATGACAGGT ACGATTGAAA GTATAATTAC AAATATAGAA ATTATTGCCG 52 80 

ATATAACTAT TGTTACTATT AAATAATCAG CTCTGCTACC TGATAATAAA TAGAAAAGGC 534 0 

CGAAAATTAG TCCATAGCAA ATTACAAACC CACATAAAGT TATAGCCATG AGTACTATAT 5400 

AAGCTATTTG AAAATATAAA CCTATCTTTA TGAATGATTT TTCTACATTT TTTTCCATGT 5460 

CTATTCCCCA TTTATTTAAA ATTTATACTT TACCTTAAAT ATTCTCTTTA TTCTTTAGTG 5520

ATTTTATCTT TAGATTCAAA TTGATTCTCT GTACTTTCAA TATCAACTTT TTCATTTTCG 5580 

TCTGTCGATT CATCTTTTGA GTATTTATTC CAAATCAGCA AAATACCACC AATCAGCCAT 564 0 

AAAATTGACG AAAGGAAATT ATATAAACAC AGTGCAATAA TAGCATAAAC AATAAAAAGT 57 00 

GCACCTCCGA TTACAGAGTA ACTTTCCATA TAAATCGCAG TAAAGATGGT TGGTAAAACA 576 0 

GTGAAAAGAG CCAATATTAA TCCTAATAAA AAAATTGTTT CGTAATCAGA TCCTCCAGCA 582 0 

ATATTAATAG ATATCATCCT AACAAAAACG ACACTAAAAT ATATTTGAGC TACGATGCCT 5880 

ATCCAAATTG CTATTTTTCC TATAATTGAG CTCATACTCA TTCCCCATTT ATTTAAAATT 5 94 0 

TATACTTTAC CTTAATATAC CTTATTTTAT TTAATTTTTA TATGCAAAAT ACAAAAATGG 6000

AGAACTTCAA TATTTATAAA ATATCAAAAG TTCTCCACAC TATATTGTTT TATTATATTT 6 06 0 

TCGCTATCAA TACGCTAAAT CATCATATTT CCCTCAACAT CACAGTAAAA CTATTGCTCC 612 0 

TTCCAATTGC GCAGTTGTTC AACATCATCA TCTTGTTTAA GTAATGCCAG TGGTACTTGA 624 0 

AGATTAAGAC ATCGTCCTGA AATATTAAAG CGTGTCACAC CTGCTGGCAC AGTTTCCCCT 63 0 0 

TTATGAACAA CCGCTTCAAT TTCCTTATAA CTCAATGGCT GATACTTCAT GAGTACATCT 63 6 0 

TGTTGAGAAA GACAAGGATA TGTACCTTGT GCAATTCTCT CTACAGAACA ACAACCACTA 64 2 0 

TAACTTGCGA CAACCTTTTC CCATACTTGA AAATGTGCTT CGCCTAAATC TTTTGTATAC 64 8 0 

AAATATTGTT CTGTATCACC ATGACACATT GTAATAAATG GCGCTTCTTG TCTTGTCTCA 654 0 

GTAGTCCATG GCAAGCGATG TTCTTGTTGT AACGTTTCCC ACCACACACC AAATGGAACT 6600 

TTATGTTGCC ATGTACTAAT TGAATATTGT GTTTCATGGA TTTCTTGCAC TGGAACTTTC 6660 

TTACATCCTA ACGCTTTCAA ACTTGTATAC CGATGCACAC CATCTATAAC CATATATCTA 672 0 

CCATGTTGCA TCGCTGTCAC TAAAATAGGA TGACGTATAA AATCATCTGC TTCAATACTA 678 0 

CTTTTCGTTT TTTCCAATCT TAAAGGTTCG AATGTTTCGT GAAGATCAAT CTTATCTACT 684 0 

GGTACCAATT TTAAATGTTC ATGAATATGA TTCAATAGTT ATTCATCCTC CTTTGTTTGT 6900

GTTAAATAAA TAAATTCAGG ATGTGGATGG CTTAAGAAAT CGTGATGTGA AATAGACCAT 696 0 

CCGTATGCAC CTGCATATTT GAAAACAATA ACGTCGCCTG TACTGATTGC GTCTATCTGT 702 0 

ACTTCTCTAG CAAAGACATC TTTCGGTGTA CATAATTGAC CGACTAACGT TGTGTCCTGT 708 0 

CTCGAAATTG AAACTTTTTC AAATGAATAT GGATTGTCCT TATAGCGATA AATGTCAAAA 714 0 

GGATGGTTAT GTTGCCAAGA TACCGGCAGT CTAAATTGTT GCGTACCTCC TCTTAATATG 720 0 

GCATACCAAG CACCATGTAC TTTCTTAATG TCTAGCACTT CTGTCACATA GTAACCAATA 726 0 

TGTGCCACAA TAAAGCGCCC ACATTCAAAG TTCAATGTCA CATCTTCCAT TTCTTGCTCA 7320

ACGATAAGTG TTTTAAAACG TTCTACAAAA TTATCCCATT CAAATTGGTT AGTTAAATCT 73 8 0 

GCATAGTTAA CGCCTATGCC ACCACCAAGA TTGATATGTT TGAGTGGAAA TCGATGTTTT 744 0 

TCAGACCATG CCTTTGCTTT TTTAAAATAA AGTTTCACTA CATCGACATG TAAATTCGAG 750 0 

TCTAAATTGT TAGAAATAGA ATGAAAATGA AATCCATCTA GATGAATCTT TGGCATTGCG 7 56 0 

AGCGCAgcTT cAATGACATC ATCAACTTCG TCTTCAGAAA TACCAAATTG TGTTGGGCGT 76 2 0 

CCTGCCATAT GCAACGTTGC ATTGGGAAAT GGTCCTGCTA AATTAACACG CAATAAAATG 76 8 0 

TGTTGTGTCT TATCTTCATC TTCTAAGATG GCATTTAGCC GTTGTAATTC ATGCATACTT 7 74 0 

TCAACATGAA TACGCTGAAC ACCTTCACTT ACTGCATATC TTAGTTCCTC GTCTGTCTTA 7 8 00 

CCAGGGCCAC CAAAAATAAT ATGATTTGCT GGTTTAAAAG CAAGACCTTT TGCTATTTCA 78 6 0 

TGTTGCAAAT GATGTTCCAG TCCGACTAAA TCATAGATAT AATGACAAAC TGGATGAGAT 804 0 

TGTGCTTTTA ATTGTTCAAT AACAGGTTGA ACTATACGCA TTAGCCTTCA TCCCCTTTCT 810 0 

GTTTAGACGT CGCTAGAGAT GCACTTAAAT GGCGATATAT TTTTCCGCGA TCATCACCTA 816 0 

AAATAAATGT TTGTACACCT TGTGCCTGCC ATTTTGCAAT ATCTTCATCT TCACGTGGTA 822 0 

ATGCACAAAA ATGTTTACCA TGTGCATTCA CAACTTCAAA AATATGTTGA ACATGTGATG 82 8 0 

TTACTTGATC ATCACGCGTT TGCCATGGTA TGCCAAGTGA CTGCGATAAA TCTGCGGCAC 834 0 

CTTCGACTAT CATGTCTAAA CCTTCGACTT GTGCTATATC GTCAATGGCC ATAACCCCTT 84 00 

CAACATCTTC TATCATGGCA ATCACCATAA TATGCTCATT AGCCATCTCC ATTGCATCAA 846 0 

GTAATGGTGT ACGTCCAAAT CTTGCCATGC GACCACCATT CAAACTTCTT AATCCTTGCG 852 0 

GGTAATAACG ACTTAATTTC ACAATATGCT CAACTGTCTC ACGATCTTTA ACGTGTGGCA 858 0 

CAATAATACC TCTCGCACCC ATATCCAACA CTTTAATGAT ATCTCTATCT ATCACTGCAG 8 64 0 

TGACACGTAC AATTGGTATA ATATGCGCTG CTTCAGCTGC ACGAATTAAA TGCGCTAGTG 8 700 

TCTCATCATT AATCGCCACG TGTTCTGTAT CAATCACAAC AAAGTCATAC CCGCTTGCTG 8 76 0 

CGATAACCTC GATCATCAAT GGGTCCGGTA TAGAATTAAA AATGCCATAA ACTGAATCAC 8820

CATTGTTTAA TCTATGTTTC AGAGATAGTT GTTGCATCAT TGATACCTCC TACACCTAAT 8880

GGATTTGTAA CATGATGAAT TCTTAACTCG GAGTCACTTA ATAATCGACG TGTCGTTAAC 8940

TTTTCAACTT GAATCGTAGG TTCAAACAAA TCGAAATGTT GATAGTTATT CAACTCTGGA 900 0 

AATGCTTCTT GATACGCCTC GATGATGCCT TTAACCCATT GCCATTGCAG CTCCTCATCG 9060 

ATACCATATT GCTTTTCAAT AAATAAGATG ATTTCGGCGA TATTAATAAA GAAAAATGCA 9120

TCATGTAAAA AGTCGCGTAC TAAACGTTCG TCATCTGTTT CAATAAATGA ATTACTATTC 918 0 

ACTTFTTTAT GTGCTTCTGG CATTGGCTTT AATGTCAGGT GTGAAGCAGC TTCACTTAAA 924 0 

TGctCACGCT TAAAACGAAC ACCATCATGG AAATCTTTTA AGGCAATACG TGTAGGCCAA 93 00 

CCATTTTCAT GAATGAGCAT CATATTTTGT GCATGCGATT CAAAGGCAAT ACCGTGATAA 9360 

TAAAGCATAT GAATCATTGG ACGAATCGCT ACAGCTAAAA ATTGCTTTGT CCAAGCTTCA 94 20 

GAACCATATT GTTTAATCCA ATTTTCAATG AATGGTACAC CATCCTTATC ACTTGCATAA 94 80 

AGTGCATTAA ATGGTATCGC ATCCTCTTCA TCGATTAACA TATGATATAT ATTTTCACGC 954 0 

CATATAACAC CTAACGCACC ATAAACTTGA GTTTGTTTAT AAGGCGAAAG TTGTGTATTT 9 6 00 

AAATAAGACT GTCCTAAGAC TTCCCCTAGA AAAACTGTCT TTAATTCATC TTTTAAATAC 966 0 

ATATCTTGTT GCTGTATCTG CTTTAACCAA TCCGTAATTT GCGCTGCATT TTCAATTGTA 972 0 
TATTTTGTCG TGTCTATTGG CGACATCGTA CGAATCGATT GTTGAGGGTG ATATAGCTCA 9840

TCACTTTCCC CTAACCATAG TACTGTGCCA TTAAGCCTTT CTTCAGCCAA ATCAACTTGG 9 900 

ATGACATGTT CAAACTGCCA TGGGTGTACA GGTATCATCT CAACATCATT TACATGTTTG 9960

CCAGATGCTT CAATTTGCTG TACAAAATGT TCATAAGTCT TATCGCCAAC TTGTTGACGT 10020 

AACATTTCGT TAACTACAAC ATTTCTTGAT ACCGTCGTTT CTACTTTATC TTTGTCGATA 10080

GCTAACCACT GCAGTTTAAC GTTTGGTACA AAATCAGGAC CAAATTTCAA ATTATCACTC 1014 0 

AACGTAAATC CTAAACGTGA TTTGTAACTT GGATGATACT GATGCCCTTC CATCGCATAA 10200 

AATTCATAGT CGTTAAATGT CTCAGGTGTT GCTGGTGGGT TTGATTCTCG ATACTGCATA 10260

CTTTGCGTAT CTTTTAATTC TGTCTGTAAT AACTCGACAA TAAATTGTTC TAGCTTTTCA 10320 

TCATTTTTAG GAAATGTAAA TACAACCTCT CTCAATAATT GTGTATAGTC TGTTGTTGTA 10380 

TCTGCCTCAT CTCCTACGAC ACGCTCAATT GGTGATGTGA TACGTATACG ATCAAAGCTA 10440

TGTGTCTTTT CAGCAGTAAA ACGATACTCT GAATCATGTC CTTCTATTGT AAAATGACCG 10500

ACACCGTCTT GATATGACGC TTTATACACA ACAATATTCT CATAAATAAG TGATGATACC 10560

AGTTGGTGCA TCACTCTAGT CTTTACACGA TTAAGAATTG TTTGATTCAC AATACGATAC 10620

CTCCTTGTTA TGACAAATTG GATTTGGTAT ATGTGTATAA ATAGGGTTTG CACCACAATC 10680 

ATTCAATTTA CTCATCAAAT TCGCTTTAGC CGcAATGGTC GGCGTTTGAT ATAAATCTTC 10740 

TACACAGTCA ACAAATACTG CGTTATTCGC GTATTCTTTT TTCCAAGTCA TAAGACGATG 10800 

CGCTACAAGT TGCCATAACA CAACTTCATT TCTAGTCGCT TTACCAATAG TTGATACTAA 1086 0 

ATGTCCTAAG TGATTTACTA CAACGTAATA TTTAAGACGA TGCCATGCTT CATCATGTGC 10920

ATATACAACA GGGCTTGATG CTGCCACAAC ATTTGGCACA AGCTGTTTTT CAGTAGCAAT 10980 

CGTTCTAGAT AGACAAATGC CTTCAAGATC TCTGACAAAG CATACGTCGG GTATGCCATC 11040

TTTTAATTCA ATTAATGTAT TTTGTACATG TGCTTCTAGA CTAATGCCTG TGTTACTAAA 11100

CAGCTTTAAT ATCGGCAATA ATGTACGATT CAAATAACAT TCAAGCCATG CTTCTGGTGC 11160 

TAAACCACTT TGCTCAATCA CTTGTGATAA CTTAGACATC GGTGAATCAG GCATCGTTTC 11220

AAATAATGAC GCCAATACAT GAATATCTTT ATCAGCATGG TAATTCGGTA TCCCTTCACG 112 8 0 

AACAATCATG GCACTATTTG TTAATAAATC CATTTCAGGT TCAACTGTTT GCCCTAATGG 113 4 0 

ATTCGGTAAC AATGCACGAT ATCCTTCTTC AAACATCAAT TTAAAATGGG GTGTTTCAAC 114 00 

CTCATCTTTG ACTGATGCGA TAACTTGCGC GGCATCAATT GTCCGTTCAA TCTGTTCAAG 114 6 0 
GCCAAGGTCT TTTATTAAAC CTTGTTCACT ATATTGCATA TACTGTGGAT GCTGTCGCAA 11640

CACATTGATT TGATAAGGAT GTGTTGGTAA TAAAATAAAA TCTTTGGGTA TCTCTGATAT 1170 0 

ATCTATGTCT GCTAATTGAT ACAACACTTT CTCAACCTGA TCTTCTTTAC CTTCTACATA 11760 

GCGCGTGAGC AGAACATCTT GATGCACAGC TAAATAATGC AATTGGAATG ATGTATGACA 11820 

TTCGGGTGCA TATTTCTCTA AATCTGCTTC TGAAAACCCA CTTGCACTCT TAGGAGTCGG 11880 

ATGAAATGGA TGACCTAAGT ATAAAGATTG TTCTGAAACG ATATAACGAT CCTCTACGTA 11940 

GTCTATTGTG TTACTTTGCA AATAACGTGC CGTGCGATGA ATGCTATTAT CGATGTCAGA 12000 

CATAATTTGC GCCATATGTT GTTGCACTGC CGTTTGATTA TCTGCACTTT GAGCCATATG 12060

TTGCAAAATA CGCGCAATTG CTTCTTTATA AGTTGTTATT TTTTTACTTT TTCCATCGAT 1212 0 

AAGCCATACC TCTGGATGAT ACATATGATG CCCCATCGCA GACCAATAGC GAAATTCACC 12180 

CGTTAAAGTT TCGAGCTCTG ATAATTGTAT AGACCATTGA TGATTTTGAG GTGGTACTTG 12240

ATATAAATTT TCTTCTCTAA AATATTCATT TAAAATGCGT TCGATAGCCG CATACGCTGC 123 00 

ATGTTGTATT AATTCTTTAT TTTGCACTTT TTTGTTTCAA CTCCCATAAT TTCATTAATG 12360 

TGTGATCGTT GATTTGATTA GTGATGGTTG AACAAATTAA AAATAAACTA CTTACTGCAA 12420 

ATACTACGCC CATAACGATA AACGTAGTAG CTGGTGTAGT ATAACTTGTA ATGGCAGCGC 12480 

cACTaAGACT GCCAATAATT TGACCAACAA CTAACATACT GTTCGTCGTT CCAACAAATG 1254 0 

TGCCTTTAAG TTGTTGATGA CACGCATTCA CGACAACAAA CATGACACTT TGAATCAATG 126 00 

CACTATATGT TAATCCTTGA AGTATTCTTG CAGCCATTAA AAACTCTATA TTCGTCGCTA 126 60 

AACCTTGCAG TATCGCACTA CAACCACATG CAATCGTGGC AAATATATAT ACTGATTTAA 12720

CATATGATTT ATCATTAAAG CGTCCCCATA AAGGCGCGCT TAATATCGAA GCCGTCCAAA 12780 

ATGCGGACTG TAAAAATCCA ATCACACTAC GGTCATCTAT CGCTGTATGA TTCACTGATG 1284 0 

AAGCAAGTGG TGATAATGCA GTTAGCATGC CATACATAGC AAAGTTTGCT AAAACGCCAA 12 900 

CGATAATAAA TCGACATGTT TGTTGTGTGC ATAATAGACA TTGAAATGAA CGGCGAATAC 12 96 0 

CTTTATTAAT ATTTGGTGTT TGTGATTTTG GCATATGTGT CGTTTCAATC AATTTTAATG 13 020 

CACCGAAAAT ACAGACAATA AAAGTAATAA CGGCAATACT CATCAGTAAC GCACTAAAAC 13 0 80 

CTAATATCGA AGCTGTAACA CCGCCAATTA ATGGCCCCAC AAGAGACCCT GCGCTGACTG 1314 0 

AACTTTGCAG TCTTCCTAAT ACCTTTCCAC GATCTTCAGC TGGCGCCTCT GCACTCGCAA 13200

ACGCACTTGA TGCATCAACA ACACCACCAA ATAGTCCCTG CAATAACCTC ACAAGTACAA 13260 

ACTGTAATGG TGTCGTACAC AATGCCATTA AAAATAAGCA TACCGCCAAA CCAAGTAACG 13320 
CtATCATCGT CGTTACAGCT GGAGCAGCAA TCGCTATACC ACTCCACAAC TGTATTTCTA 13440

CGACTGATAG ATTTTGTAGT GATGCCATAT AAATTGGCAA TAATGGCACA AGTACTGTCA 13 500 

GTCCAGCAAT CGCTATAAAC TGACTGAGCC ATAAAATGCG AAAGTTACTG CGCCATATAG 13 560 

ACTGATTAAT CATATGTCAC CATTGGATTT GGTACGGTAG TTAAACCTGA AGGCATACTA 13 620 

CCTCCACCAC TATCACGTTG ATATAGCAAT GGTAATAAAA TTTGTTTGAA TGGCCACGTC 13680

TGTTTATCAA ATAAAATGTG TCTGACAGCT AGCTGATCAG TTGTAACCCA GGAAATAGTT 13 740 

GCCACTTCAT TTTTTAAAAT TTGTTTTAAC AACGACATAA GTTCATGCTC ACTTACACCA 13 80 0 

AATAAATCTT GAATTGCATC AATAATGGCA TATAGATTTA CCGATACAGC TAATGTTTGA 13860

AAATAAGCAA AGAATGTTTC CAAATCCTCA TTAATTAGCG TATTAGGTGT ATCTTCTCTG 13 920 

ACGACATACT TCGGCAATGA AAGCTGATGT GCTGTTAGCC ATGGTTTATA AATTCTGACA 13 980 

GTATCATGAT CACGTAACAC GCATTTTTGT ACACGTCCAT CTTCAAATGA CAACAATATA 14 040 

TTTTGACCAT GCAACTCTGG TAATGCGCCG TATTGCATAA ATGATAGTGT TACCTTTAAA 14100 

AAGACTTGCG CGATATCTTC AAATAACGTC ATGACATCAT TTTTAGAAAT ATTATCTTTT 14160 

CCACAAATCA TTTGATATAA AGTGCGATCA TTTGCCGCGA GTGCTGCCAT TGACACTAGC 14220 

TGTTGCGTAT CATTTTTGGC TAGCACTTCG GGATACTTTC TTAGCTGAAC AGTTAGATGA 14280 

CCTAATTGAT CTTTGAAAAT ATCATTATCT TGACCCATAT ATGACCACCA AGCTGTTTCA 14340

TCACAAACCA TGACATACTT AGCTAGTGCT TCATCTTTTT CTATAAGCTG ACGTAATAAT 144 00 

TGTTCTGCTT GTTCTCCGTT TTTCATGTAA CGCGTAGGCG TTAGCCTTAA TGCGCCTAAT 144 60 

GACTGCATTG CAAATGGTAC TTTGACATGG TTATACGGTG CGCCAATATC AATTAATGAA 14520

CGCATACTTG AAGACGACAG ATAATCTCCA AATTTTAACG GTAATAGTAC AACCAACTTT 14 5 80 

TCACTAATCT CTTTCGCAAA GACGTTCGGC AGAATATGCT GATATTGCCA AGGATGTACC 1464 0 

GGAAATAGTA CATAGTCATC TATTGATAAC CCTTGATCAT TTAACATGTC TGTCGCTTGT 14700 

TCTTTTATAG GTACTGTCAA ATTTTCTAAT TCATCGATAT TTGCAGTATC GCCATGAATC 14 760 

ATATGTGTCT TTTTAACTGC TGCAACCATT AAAGGAAATG ATTGATTTAA TTCAGCTTGA 14820

TACACTTGAT AATCCGCTTC TCTTAATCCT CTTTTTTCTT TAGCTAATGG ATGAAATGGA 14880

CGATCTTTTA AACTTGCAAA CTGCTCTGAC ATCACAAAAG GATGTGACGC TAAATCTAAT 14940

TCTGATAATT GTTTAGCAAG CTGTGTGGCA GCAGTAGTCA GTCCTTCTTC AACGCGAGCC 15000

' -™ a^^A^ AT^ACAATTr ATATTAGCAA TTGTTTGCCA AAATTCAGCT 15060 
TATATCAAAA GCGTTTGTCC GTTTTCTTTA GTAATCTCAC TATTCGATAC AATTCCGGCT 15240

ATATCTTCAA ATAATAATGC ATCAACTAAA TCTCTTAATA TTATCGCTTG TGCTGTATTG 15300 

ACTGCTGTAT GATTCTGCAA TGTTCAGACA CCTCGCATTC TTAATATAGG TTCAATGTTG 15360 

TCCCAATATT TTGTTGTTGT GCCTGTTGAT AAATAAAATA AGCACTTGAA ATATCTTCGA 15420 

TAGCCATACC CATCGGATTA AGTAATATGA TCTCATCATC GTCTTCACGT CCTGGTATGT 15480

CACCTGTCAC AAGTTGTCCT AGTTCAGCAT GAAGAGCTTC TTTGCTGAAT TTACCTTCTA 15540 

ACACCAATTG GTTAATAGTT TTCTTTTCTC GATTACATTG TGACCAGTCA TCTACTACGA 15600 

CTTTGTCAGC TTTAATAAAG ACTTCTTTAT GCACATCCAT GATAGAAATG TTGCTAATAA 15660

ATGCACCCTT TTGTAACCAA TCATATTCAA TGTATGGTTG ATCCGTTACG GTACATGTAA 15720 

TGACTACTTC ACCATTTGAT ACTGCTTCTT TAGCATTTTC TGTCGCAATA AAATTAATTT 15780 

CCGGACGCTG TTGTTGCCAT CTATCAACAA AGCGTGCACA TGCTTCAGAG AATTGATCGT 1584 0 

AAACAAACAC GCGTTCAATA TGATCGAATT GCTCTAACAT ACTTTGTAAT TGCTTGTCTC 15 900 

CGATTAGCCC GCATCCAATG ATTGTTAAGT CTTTAAATCC TTTTTTAGCC AAATGCTTTG 1596 0 

CTGCAATCAC TGAAACTGCT GCAGTACGCA TACTACTAAT TAAACTTGCT TCCATAACTG 16 02 0 

CAATTGGATA ATTCGTTTCT GGATCATTCA AAATAATGAC GCCACTTGCA CGCTCCATAT 1608 0 

TACGTTTCGA TGGATTGTCG TGCTTACTAC CTATCCACTT AATACCTGAA ATTGCGTGTT 16140

CACCACCGAT ATGACTTGGC ATTGCAATAA TTCGATCTGC GATGTGTCCA TTTTCAGGAT 162 0 0 

CCtGTCTTAA ATACGGCTTA AGCGGTTGTA CAAAATCATT GTGCGCATGG GCTGTTAATG 1626 0 

CTTCTGTTAA TGCGTCCACA TAAACTTGTG AATGATTACC TCCCGCTTGT TCAATATCTG 16320

ATCTATTTAA ATACAACATC TCTCTatTCa TTCTGaTTTA ACTCCTTGTC TTGATTTCAT 163 8 0 

TTTTTCTAAC CATGTATCTG AATAAACTAA ATCTAAGTAA CGATCGCCTC GATCTGGTAA 1644 0 

AATCGTGACA ATTGTTGCAC CTTCTTCAAT TGACGTTATC AACTGCTCAA TCGCTGCAAT 165 0 0 

AATCGAACCT GTTGAAcCTC CGGCAAATAT GCCTTCATAA TCAATCAGTT TTCGACAGCC 165 6 0 

CAAAGCAGAT TGATAATCAT CTACATGGAT CACTTGATTA ATTTCTGATC TATTCAATAT 16620 

TTCGGGTACA CGACTAGCAC CGATACCAGG TAATTCTCTA TTAATAGGTT TGTCACCAAA 16680 

AATGACTGAC CCTTTCGCAT CAACAGCAAC AATTTGTGCG TTTGGATGCA CTTCTTTTAT 16740 

TTTTCTACTC ATACCCATAA TGCTACCTGT CGTGCTGACT GGCGCGACAA AATAATCTAT 16800

AGGTTGCTTA ATTGTTTCAA CAATCTCTGT GCCTGCACCA TGATAATGGG ATTGCCAATT 16 86 0 

TAACTCATTC GCATATTGAT TAATCCAATA TGCATCGTCA ATAGTGGCTA ACAGTTCTTG 16 920 
TACATTGGCA CCATAACTTT TAATAATTTT CAAATTTGTT GGTGATATTT TAGGATCAAC 17040

AACACACGTG AGTTTTAATC CCTTGATTTT AGCTATCATT GCCAACGCAA TGCCTAAATT 17100

ACCAGAAGTA CTTTCAATTA AATGTGTATT CTCAGTGATT AAACCATGTT TAATACCATG 17160

TTCAATGATG TACTTGGCAG GTCGATCTTT CATGCTGCCT CCAGGATTCA TATACTCTAA 17220 

CTTTGCAAAC ACTTCATGTT TCGGAAATAG TTGATGAAGT TGAACCATAG GTGTTTGCCC 17280 

TACAGAATCT AACAATGAAT CGTGCACATG 17310

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5423 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

ATACTAGTAA GCGCATCGGT TATTGACATC GAATTCAACT TTAACAGTTT TCATGTTCGG 6 0 

TGATGTTTCa ATAGAATGTG TGTGTTGTAC TTGCGCATTT ATATTTCCAC CTAAATTACT 120 

TAAGTTTCCT GTAATACTAG AAATGTCAGG TGCGTTTAAT GTAGGTTGAA ATGCATCAAC 180

TACTTTATCT GCAACATTAG AAACATTACG GATAACTTTA CTTGAATGAT TATCTATACC 240

TTTAACGAAA CCTAACATTG AATACATACC AACATCCATG AATTCACGTG AAGGTGAGTG 300

AATACCTAGC GCTCTTTTGG CTGCATTTAA AGCACCTTTT GCTACACTAG CTGCTTTTTC 36 0 

AGCTAAGTCT CTAGCCATAT TACCAATACC TCTCATCAAA CCACGGATCA TATCAGCACC 420

TGCTGATACA AAGTCATCCA CAAAGCTTTT AACTTTATTT ACTGCATTTG TCATACCTTG 48 0 

ACTAACTTTG TTTACAACAT TAACGAATCC TTGAACAACT CTATTAACAA rGTTAATTAG 54 0 

CGTACtTGTt ATAGTAGATA CCCaTnGCAT ACCTTTAGTG ACmATGAAGT TCCAAGCTTG 6 00 

AGACATTTTG TCTGATATAG TTGAAACAAC TTGTGTGAAT ATGCTTACAA CTTTATTCCA 66 0 

AATTGTCGTT AATATACCAG ATAAGAAACT CCAAATCGTA TTCCATATAT TAGAAATAAA 72 0 

ACTCCATGCC GCTTGTAACG CAGTAGATAT AGCTGTAGTG ATAGCGTTCC AAACCTTAGT 7 80 

TGCCACAGTA ACTATAGTGT TCCACAACGT TTGTAAGAAC GTCCAAATAG CGTTCCAAAT 84 0 

TGTTATTGCG ATAGTCATAA TTGTGGTAAA CACTGTAGTT ATTACAGTGA CTAACAAATT 900

— riTA^rGATTG TAATTATCGT ATTCCAGATT GTACTTAAGA ACGTCCAAAT 960 
ATAAGCGACT ATTTGATTCC AAACAATCAT TATAAAATTG TAAACATTCG ATACTGCTGT 1140

AGTGATAGCT GTTAAAATAG CATTCCATAC AACCGAAGCT ACAGCTTTTA ATACATTCCA 12 00 

AACATTAACC ATAAACGTTT TTATCGCATT CCAAGCATTT ATAATAAAGT TTCTGAATCC 126 0 

TTCATTTTTA TTCCACAATA AAACGAATAT AGCTATTAAT GCAGCAATTA CACCAATTAC 13 2 0 

TATTGTTATT GGACCGCCTA AAATACCAAA CACAGTTACT AGTCCTGTGA TAGCATTTCT 1380

AATTAATCCA ATCTTACCGA ATAACAATTG GAATATAACT GATATAATTT TTAATGGTCC 14 4 0 

TTTTAATAAC ATGAACGCAC CTTTTAAAAT TGTTAATCCC GCTCTTAATA AACCGAACTT 1500 

15 ACTTACTAAT GCAATGrTTC TAC CTATTAA TCCGCCACCC ATAAAGTTAG ATACAGCAAG 156 0 

AATAATCGGT ATTAAAAATC TAAATGCACC AACTAAAGTT ATAATGACAC CAACTAATTG 162 0 

TGCTGTAGCT GGATGCGCCT CAAACAAGTT AGCTATCCAA CCAGTTATTG CAACTGCAAC 16 8 0 

20 

GCGTAATACT GCACTAGCTA TAGGAGCCAT TGCTGTTGCG AATGCArmTA ATCCTCTTGC 174 0 

GATGTTTCCA ATCAATTGCA TTATTAGTGG TCCATTTGTT TGTATATAAC TGACAAAGTC 18 00 

TTTAAACCCT TGAGATTGTC CTACTTGTTC AGACCATTCC CTAAACTTAG CTGTCATTTG 18 60 

25 

TTCAAGAGAT TGGAATATGC CAGTTGATGA TCCGCTGAAT GCATTCATCA AATTGTTAAT 192 0 

TCCAACGAAA ACATTTTTGA AAAT ATT AC C AATGATAGGT AAGTTTGTTT TTGTGTATTC 198 0 

30 AATAAAACGA GTTATCGAAT TTTCTCCAGC TGCACTATTA GCCCAGTTAG AGAAAGATTG 2 04 0 

ACCTAATCTA TCCAACCAAT CAGCCGACCA TTGAAACAGT GGTG CTAATT GCGTGAATAC 2100 

ATTGACTAAT CCGTCACCAA AACCACCTGC AGCACTTAAT AGCTTGTTAA ATACCGAAAC 216 0 

35 ACCCGTTGTA TTCATCATAT TAAAGAATCT TGAAGCTACA CTGCTATTTT CAGCC CATTT 222 0 

AAGCACGCTT TGAGACGCTT CTTCCATTCC TCTTGAAATA CCACTAAAAA ACGGTTGTAA 22 8 0 

GCTCTGCATT GCAGTTTTAA CAGTATTTAA AC CATTT GCA AGAGTTGTGA AGATAGCGGA 2 34 0 

40 

TTGATTTTGC TTTATAATAT CAGTCCATGC TGACTTTACG CCATCTAACG CTTTTTTGTA 24 00 

TTCGTTTGTT GCTGAGCTAG CTTGTAAAGT GCCATCATTA AGCATCTTTA TAGCGCTGAT 24 6 0 

AGCCATTGCG CCAAACGCTA CAAATCCTGC TCCCGCTATT GCTACGGCAC CACCTAAAGC 252 0 

45 

AAGTACACCA CCAGTTAACA CTTTGATAGC GTTTAATAGC GCAAATACTA CAGGTACTAC 2 58 0 

GCTCGCTATT ACAGGTATTA AGATACTAAA AGATGATGTA AGTAATCCAC CAACCATATT 2 64 0 

50 AGAACCTACA GTACCGAACA CACGGAACAT ATTAGCTAAA TTCCCCATCT GTCTTTGAAA 27 0 0 

ATTGT CATTT GCTTTTATTA TGTAGGCATA AGCTTTCTTT AAACCATTAG TATCGACATC 27 6 0 

TACCTTTGTT GTTTTTTTGT TCGGCAATGC GTCTAATGAT TTTTT AAA CG CATAAATAGT 28 2 0 
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AAGTTCTTCT TTAGTACGTT TGATTTTAGA GTTAGCAACA CCATTGTCCA CGTCTATAAT 2 94 0 

AGCTTTGGCT TTAGACCTAT TTAATGCTTC GAGACTAGCT TTAGATACTT TTAACACTCG 3 000 

5 

ATTGAATTTA CTGTTATCTG CATTGACGTC AATATTGACA CGTTTCTTTT CTAATTCTGA 3 06 0 

TAATTTAGCT TCTGTTTCAG CGATATCTTT AATCAACTTT TGTTTTTGCA ACTTAACTTC 312 0 

TGGTGTAACT TCTTTAGAGT TTAGTTTGTC TAGTTCAAAA TTCGATTCTA GTACCTTTTG 3180 

w 

TTGTAAATCT TGTATACTAG CATCTAATTT AGCTTTTACA TTTTTGTTAC TAAAGGCATC 3 24 0 

TAAAGACTTT TTAGCAACTT TGATAGTTTT TTGTAAATTT TTATCGTTAG CGTTTAATTC 3 3 00 

j 5 AACATCTTTA GTTTGATCTG CTACTCGTTT AAATCTTTGC ACAGACTTAA CCGCACTATC 33 60 

AATTTGCCTT TTGAATTTGG CTACACTAGC TTCAATAGTC GCTTTAATTT TATATTCCGT 3420 

CACATTAACA CCTCTCTTTC TATTGCTTAT TAAATTCTGC TATAACTTTA AAGAATTCAT 34 80 

20 TATTTTGTGG TTCGTATTCA TCACGTTCGC TACTAAATCT TATATCTTTA CCTTCGTTAA 354 0 

GCCGTTGGAT ATTTTCTTCA TAAGGCAATA CGTCGTTTGC ATTGTTAAAA ACATATTCCT 3 6 00 

CTTTAGGTTT ATTTTCTGTC CCAACATTTT TAGTAGCTGC AGCATCACGA ATAGCAAACG 36 60 

25 

CAAGTTTGTA ACGTTCGAAT TCTTGGGTTA GCATTTCATA CTCTTTCGCA TACATTCGAT 3 72 0 

AGTT AT ATT C TGTTAATGTC ATTTGCTCAA TAACGTTCAA ATCTGTAATA CCAAGTGTTG 37 8 0 

ACATACAAGT TATAACGATT CTGTCGTAAG TTATTAGGcT TCCGCTGGTT TTTCTTCCGT 3 84 0 

30 

TTCCACTACT TCGACTAGGT TTCGGGTCAT AGGTCGCTTT CCCAAcTCCG TTAAAATATC 3 9 00 

CGAACCGAAT TCTTCTAGTC CGATATTTTC TGCGATTTCA TCTAATGCTT CAT CAATGTT 3 96 0 

35 ATTAATAGTA ATTGCTTGTT TTTTTAAGTG AGATGTAGCT GCGATTAAAA cTTCGCCAAT 4 02 0 

CACAACCGGA TTTCCACTTT CTAAACCTAC AGGCAACATT GATACACCTT GAC CGATAG A 4 08 0 

AGCTTGTTCA ACTTTTAAAC CTAATCGGTT ATCGATTTCT CTTAAAAATT TAAAACCAAA 414 0 

ACTTAATTCT AATGACTTTC CGTTAATTTC TACATTCATA ACTTAAAATC T C CATT CAT A 420 0 

ATTAATTTAA ACAAAATAAA mArGCTTAAC GCCCTATTTT TATACCTCTC TTGGTGCAAC 4 26 0 

CGGTGGTGAA TCTACTTTAG GTTGTGGAAT TGCTGTTAAA TCTTCGCCAG TTAATGCATC 4 32 0 

45 

TGCTTTTGTA GTGTCGTGGA ATCTGTATcC AGTCGCCTTA AGTTTCTTTG TTACAGCCTC 43 8 0 

AGGTAGTGTT GCAAATCCAC GTTGGAAACG ACCATTCACT C CAT ATT CAT ATT CAT ATTC 4 44 0 

so ATCAATACCG TTAGCTTCTG CTTTTAATTC AAATTTATTG TGGAAACCTT GGAAATATTT 4 50 0 

CGCTTTAAAT TTAGCGGAAT CCCCATTTTT GCCTGGTATT CTACTTTCAA CTTCCCAAGC 4 56 0 
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GTCCATTGTA TCCTCTGTAT 
CTGTATCAGC TTCATGTGAT AAGCCGTATT CAGTTAAAAA 4740

AAGCATTTTA GTAGCATCTA CTTTTTCGCC AGCTTTTCTA AATAAAATAA TACGATCATT 4800

ACTATTTTTC ATATTTGCCA TTCAATATTC CTCCGTTTTT TAAAATGTTT TGTAAGATAT 4860

CGTTACTGAT GTGTGTAGCA ATTCTTGATT GGTAGTATCA TCAACTAACT GTGTGATGTT 4920

AGTATCTTCT TCTTCAAAGT CATAATCGTT TGTTTTAACG CTAGGTGTTA AATCATCAAT 4980

ACATCTTTTA ACAAGTCCGT CATGATGTCC TAAATCATCG CTTACACTCC AAATATCAAT 5040

AACTAAATTC GTATCGCCAG AATAACTATC AAACGTGTAC TTACTTCTAT TTGACTCCGG 5100

CATTTTTATT ACAAAAAAAG GATACGGAAT CTCTTGTTGC ATCTCTTTAC GAGAAATAAC 5160

AGGGAATCCA TATCCTTGTA GCGTTTCATA CGCTTTATTA TAAAGTTGTA AGTTCGGTGT 5220

CATGCTTTTA TCTCCTATTC AAACAACGCT TTCAATTCTT CTACAGTTGA TTTCCTAATC 5280

ACTTCGTATA CCGGCCACAT AAAAGGTTCA GCCTCCATGT ATCGAGTACC AAATTCTAAG 5340

AAACCACTAT AAGCTGCGTG CGATGTGATA GTGTATTGCA AATCGCCAGT TTTTTTATAT 5400

CTGATATTGC GTGATaAATT ACC 5423

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6251 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

AAACGCAGAT GTTCAATTAG AACCAGTCTA TCGTATTAAG GAAGGTATTA AACAAAAGCA 60

AATACGAGAC CAAATTAGAC AAGCGTTAAA TGATGTGACA ATTCATGAAT GGTTAACTGA 120

TGAACTAAGA GAAAAATATA AATTAGAGAC CTTGGACTTT ACTTTGAACA CATTACATCA 180

TCCTAAAAGT AAAGAGGATT TATTACGTGC TCGTAGAACC TATGCATTTA CTGAACTGTT 240

TTTATTCGAA TTACGTATGC AATGGCTAAA TAGATTAGAA AAGTCATCTG ACGAAGCAAT 300

TGAAATTGAT TATGACATAG ACCAAGTTAA ATCATTTATT GATCGTTTAC CTTTTGAACT 360

AACTGAAGCA CAGAAATCCA GTGTTAATGA AATTTTTAGA GATTTAAAAG CACCAATACG 420

TATGCATCGA TTACTTCAAG GTGATGTAGG TTCAGGAAAA ACAGTAGTTG CTGCAATTTG 480

TATGTATGCG TTAAAAACTG CTGGTTATCA ATCAGCATTG ATGGTACCAA CTGAAATTTT 540

AGCAGAGCAA CATGCTGAAA GTTTAATGGC TTTATTTGGA GATTCTATGA ACGTTGCATT 600






TACGATTGAT TGTTTAATTG GAACCCATGC TTTGATTCAA GATGATGTGA TTTTCCATAA 72 0 

TGTTGGTTTA GTAATTACAG ATGAACAACA TCGATTTGGT GTGAATCAAC GCCAGCTTTT 780 

5 

AAGAGAAAAA GGTGCAATGA CGAATGTGTT ATTTATGACA GCAACGCCGA TACCAAGAAC 84 0 

ACTAGCAATA TCAGTTTTTG GTGAGATGGA TGTGTCTTCA ATTAAACAAT TACCAAAAGG 900 
TCGTAAACCT ATCATTACTA CTTGGGCAAA GCATGAGCAA TACGATAAAG TTTTGATGCA 96 0 

10 

AATGACCTCA GAGTTGAAAA AAGGTCGTCA AGCATATGTC ATTTGCCCGC TAATAGAAAG 102 0 

TTCTGAGCAT CTCGAAGATG TTCAAAATGT TGTCGCATTG TACGAGTCTT TACAACAGTA 10 8 0 

15 TTATGGTGTT TCCCGTGTAG GGTTATTGCA TGGTAAGTTA TCTGCCGATG AAAAAGATGA 114 0 

GGTCATGCAA AAGTTTAGTA ATCATGAGAT AAATGTTTTA GTTTCTACTA CTGTTGTTGA 12 00 

AGTAGGTGTT AATGTACCGA ATGCAACTTT TATGATGATT TATGATGCGG ATCGCTTTGG 1260 

20 ATTATCAACT TTACATCAGT TACGCGGTCG TGTAGGTAGA AGTGACCAGC AAAGTTACTG 13 2 0 

TGTTTTAATT GCATCCCCTA AAACAGAAAC AGGAATTGAA AGAATGACAA TTATGACACA 13 80 

AACAACGGAT GGATTTGAAT TGAGTGAACG AGACTTAGAA ATGCGTGGTC CTGGAGATTT 14 4 0 

25 

CTTTGGTGTT AAACAAAGTG GaTTGCCAGA TTTCTTAGTT GCCAATTTAG TTGAAGATTA 15 00 

TCGTATGTTA GAAGTTGCTC GTGATGAAGC AGCTGAACTT ATTCAATCTG GCGTATTCTT 15 60 

TGAAAATACG TATCAACATT TACGTCATTT TGTTGAAGAA AATTTATTAC ATCGTAGTTT 162 0 

30 

TGACTAATTG CCATGCTGAT TTGTCAATTT GAGTGCAACa CTTCGTTAAT TGAGTGATAT 16 8 0 

GACACTTGAA CTATTTAAAT GTAAAGTGGT ATTTTAACAA TTTATAAATT TTCGACTAAA 174 0 

35 TAATAGCTAA ATATTACAGT TATTTGTTGA GTCGGTTAAA TAGAAAGTGT TATGATATGT 18 00 

GAGGAATGTT TAAGACTAGG TACTAAAAAA TGAGGGGTGA GACGTTGAAA CTAAAGAAAG 18 6 0 

ATAAACGTAG AGAAGCAATC AGACAACAAA TTG AT AG CAA TCCCTTCATC ACAGACCATG 192 0 

40 

AACTAAGCGA CTTATTTCAA GTGAGTATAC AAACAATTCG TTtAGaTCGC ACTTATTTAA 19 80 

ACATACCAGA ATTAAGGAAG CGTATTAAAT TAGTTGCTGA AAAGAATTAT GACCAAATAA 204 0 

GTTCTATTGA AGAACAAGAA TTTATTGGTG ATTTGATTCA AGTCAATCCa AATGTTAAAG 2100 

45 

CGCAATCAAT TT TAG AT ATT A CAT CGGATT CTGTTTTTCA TAAAACTGGA ATTGCGCGTG 2160 

GTCATGTGCT GTTTGCTCAG GCAAATTCGT TATGTGTTGC GCTAATTAAG CAACCAACAG 22 2 0 

so TTTTAACTCA TGAGAGTAGC ATTCAATTTA TTGAAAAAGT AAAATTAAAT GATACGGTAA 22 8 0 

OAGCAGAAGT ACGAGTTGTA AATCAAACTG CAAAACATTA TTACGTCGAA GTAAAGTCAT 234 0 
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TTAGAAGCCG TACAAAAGGC TGTTGAAGAC TTTAAAGATC TAGAAATTAT ACTTTTCGGT 2 52 0 

GACGAAAAAA AGTATAATCT GAACCATGAA CGAATCGAAT TTAGACATTG TTCTGAAAAG 2 58 0 

5 

ATTGAAATGG AAGATGAGCC TGTTAGAGCG ATTAAACG T A AAAAAGATAG CTCAATGGTA 2 64 0 

AAAATGGCTG AAGCTGTGAA ATCTGGTGAA GCAGATGGAT GTGTGTCAGC AGGTAATACT 270 0 

GGTGCTTTAA TGTCAGCTGG TTTATTCATT GTTGGACGTA TTAAAGGTGT AGCTAGACCG 276 0 

10 

GCTTTAGTAG TA ACATTGCC AACGATTGAT GGAAAAGGTT TTGTCTTTTT AGACGTTGGT 282 0 

GCAAATGCTG ATGCTAAACC TGAACACTTA TTACAGTATG CGCAACTAGG GGATATTTAT 28 8 0 

75 GCTCAAAAAA TTAGAGGTAT TGATAATCCG AAAATCTCAT TATTAAATAT AGGAACCGAG 2 94 0 

CCAGCTAAAG GTAATAGTTT AACGAAAAAA TCATATGAGT TATTAAATCA TGATCATTCA 3 000 

TTGAATTTTG TTGGGAATAT TGAAGCGAAG ACATTAATGG ATGGCGATAC AGATGTTGTA 306 0 

20 GTTACCGATG GCTATACTGG GAACATGGTC CTTAAAAATT TAGAAGGTAC TGCAAAATCA 312 0 

ATCGGTAAAA TGTTAAAAGA TACGATTATG AGTAGTACTA AAAATAAATT AGCAGGTGCA 3180 

ATATTGAAGA AAGATTTAGC TGAATTCGCT AAAAAGATGG ATTACTCAGA ATACGGTGGT 324 0 

25 

TCCGTATTAT TAGGATTGGA AGGTACTGTA GTTAAAGCAC ACGGTAGTTC AAATGCTAAA 33 00 

GCTTTTTATT CTGCAATTAG ACAAGCGAAA ATCGCAGGAG AACAAAATAT TGTACAAACA 3 3 60 

3Q ATGAAAGAGA CTGTAGGTGA AtCAAATGaG TaAAACAGCA ATTATTTTTC CGGGACAAGG 3420 

TGCCCAAAAA GTTGGTATGG CGCAAGATTT GTTTAACAAC AATGATCAAG CAACTGAAAT 34 8 0 

TTTAACTTCA GCAGCGAACA CATTAGACTT TGATATTTTA GAGACAATGT TTACTGATGA 3 54 0 

35 AGAAGGTAAA TTGGGTGAAA CTGAAAACAC ACAACCAGCT TTaTTGaCGC aTAGTTCGGC 3600 

ATTATTAGCA GCGCTAAAAA ATTTGAATCC TGATTTTACT ATGGGGCATA GTTTAGGTGA 3660 

ATATTCAAGT TTAGTTGCAG CTGACGTATT ATCATTTGAA GATGCAGTTA AAATTGTTAG 372 0 

40 

AAAACGTGGT CAATTAATGG CGCAAGCATT TCCTACTGGT GTAGGAAGCA TGGCTGCAGT 378 0 

ATTGGGATTA GATTTTGATA AAGTCGATGA AATTTGTAAG TC ATT AT CAT CTGATGACAA 3 84 0 

AATAATTGAA CCAGCAAACA TTAATTGCCC AGGTCAAATT GTTGTTTCAG GTCACAAAGC 3 90 0 

45 

TTTAATTGAT GAGCTAGTAG AAAAAGGTAA ATCATTAGGT GCAAAACGTG TCATGCCTTT 3 96 0 

AGCAGTATCT GGACCATTCC ATTCATCGCT AATGAAAGTG ATTGAAGAAG ATTTTTCAAG 4 02 0 

50 TTACATTAAT CAATTTGAAT GGCGTGATGC TAAGTTTCCT GTAGTTCAAA ATGTAAATGC 40 8 0 

GCAAGGTGAA ACTGACAAAG AAGTAATTAA ATCTAATATG GTCAAGCAAT TATATTCACC 414 0 

AGTACAATTC ATTAACTCAA CAGAATGGCT AATAGACCAA GGTGTTGATC ATTTTATTGA 42 0 0 
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AACATCAATT CAAACTTTAG AAGATGTGAA AGGATGGAAT GAAAATGACT AAGAGTGCTT 4 320 

TAGTAACAGG TGCATCAAGA GGAATTGGAC GTAGTATTGC GTTACAATTA GCAGAAGAAG 4 3 80 

5 GATATAATGT AG CAGTAAAC TATGCAGGCA GCAAAGAGAA AGCTGAAGcA GTAGTCGAAG 4 44 0 

AAATCAAAGC TAAAGGTGTT GACAGTTTTG CGATTCAAGC AAATGTTGCC GATGCTGATG 4 500 

AAGTTAAAGC AATGATTAAA GAAGTAGTTA GCCAATTTGG TTCTTTAGAT GTTTTAGTAA 4 560 

w 

ATAATGCAGG TATTACTCGC GATAATTTAT TAATGCGTAT GAAAGAACAA GAGTGGGATG 4 62 0 

ATGTTATTGA CACAAACTTA AAAGGTGTAT TTAACTGTAT CCAAAAAGCA ACACCACAAA 4 680 

15 TGTTAAGACA ACGTAGTGGT GCTATCATCA ATTTATCAAG TGTTGTTGGA GCAGTAGGTA 4 74 0 

ATCCGGGACA AGCAAACTAT GTTGCAACAA AAGCAGGTGT TATTGGTTTA ACTAAATCTG 4 800 

CGGCGCGTGA ATTAGCATCT CGTGGTATCA CTGTAAATGC AGTTGCACCT GGTTTTATTG 4860

20 TTTCTGATAT GACAGATGCT TTAAGTGATG AGCTTAAAGA ACAAATGTTG ACTCAAATTC 4 920 

CGTTAGCACG TTTTGGTCAA GACACAGATA TTGCTAATAC AGTAGCGTTC TTAGCATCAG 4 980 

ACAAAGCAAA ATATATTACA GGTCAAACAA TCCATGTAAA TGGTGGAATG TACATGTAAT 5 04 0 

25 

ATATTTGAGC TAAAGCTCAT TGACGCAGTG GTTGACTGGT CATCCAATGG AGAATTGTCT 5100 

GACCTAGTCA ACTTTGCGGG GGAAATTCTA AGCAACCTAG ATAAGGTTCC AGAATTTCTC 5160 

CCTAAGAAAC ACTAATCAAT aAATTGwTAA GTGTTTCTAA AATTTCTACT TGTTTTTTAG 522 0 

30 

AATTTAAAAT GGGAAAATAT AGTAGTCTAT G TAT AGO CAT TTTTAAAGGA GGTGAATCGA 52 80 

CGTGGAAAAT TTCGATAAAG TAAAAGATAT CATCGTTGAC CgTTTAGGTG TAGACGCTGA 534 0 

35 TAAAGTAACT GAAGATGCAT CTTTCAAAGA TGATTTAGGC GCTGACTCAC TTGATATCGC 5400 

TGAATTAGTA ATGGAATTAG AAGACGAGTT TGGTACTGAA ATTCCTGATG AAGAnGCTGA 54 60 

AAAAJ£TCAAC ACTGTTGGTG ATGCTGTTAA ATTTATTAAC AGTCTTGAAA AATAATAAAT 552 0 

40 CTTACATCTG GGTCGTCAGT ATTGTCGACT CAGTTTTTTT CTTTAATTAT CAATAGTTTT 55 80 

AACGTAAAAT TAAAGATGAT TCAAGAGCAA CACATAAAGG AGATAAAATA ATGTCTAAAC 564 0 

AAAAGAAAAG TGAGATAGTT AATCGTTTTA GAAAGCGCTT TGATACTAAA ATGACAGAGT 57 0 0 

45 

TAGGCTTTAC TT AT CAAAAT ATTGATTTAT ACCAACAAGC ATTTT CG CAT TCGAGTTTTA 57 6 0 

TTAATGATTT TAATATGAAT CGTTTAGACC ATAATGAGCG TTTAGAGTTT TTGGGTGATG 5 82 0 

^ CGGTATTAGA ATTGACGGTT TCACGATATT TATTTGATAa ACATCCCAAC TTGCCAGAAG 58 8 0 

GGAATTTAAC AAAAATGCGT GCCaCTATTG TATGTGAGCC CtCACTkGTA ATATTTGCGA 5 94 0 
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ATCAAGGACT AGATATAGTT TGGAAATTTG CTGAGAAAGT CATTTTCCCA CATGTAGAAC 612 0 

AAAATGAGTT ATTAGGCGTG GTAGATTTTA AAACACAATT CCAAGAATAT GTGCACCAGC 6180

AAAATAAAGG TGATGTAACC TATAATTTAA TAAAAGAAGA GGGACCGGCA CATCATCGTC 6240

TATTCACTTC A 6251

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4920 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

ACCTACTGAA GTTGCTAATT TTTTGGAGCA ACTAAGCACT GAAATTGAAC GTCTTAAAGA 60

AGATAAAAAA CAACTTGAAA AAGTAATCGA AGAGAGaGAT ACTAATATTA AGTCTTATCA 120

AGACGTGgCA TCAATCTGTA AGTGaTGCTT TGATACAAGC TCAAAAAGCT GGTGAAGAAA 180

CTAAGCAAGC TGCAGAGAAA CAAGCTGAAG CGATTATAGC TAAGGCAGAA GCGCAAgcTA 240

ATcAAATGGT TGGTGACGCG GTAGAAAAAG CACGCCGTTT AGCATTCCAG ACTGAAGATA 300

TGAAACGTCA ATCAAAAGTA TTTAGATCGC GTTTCCGTAT GTTAGTTGAA GCGCAATTAG 360

ACTTATTAAA AAACGAAGAT TGGGATTACT TGTTGAATTA TGATTTAGAC GCTGAACAAG 420

TGACGCTTGA AAATATTCAT CATTTGCATG AAAATGATTT AAAGCCAGAT GAAGTTGCAG 480

CAAATGCACA AAATAATGCA TCAAATACAC CAGACAATAA TCAACAATCC AATGATTCAG 540

AAACAACTAA GAAGTAAGAA TTAAATAAAG ACAGACGCGT AATATACATT TAACTTTTCA 600

CAGCGAATTA GGTAATGGTG AGAGCCTAGT AAAAGCATGT ATGTTATATC ACTGGCTTTT 660

TAATATTTAA ATAATGTAAT GAGAGAACTC TAAGTTGAGT TAATAAGGGT GGTACCGCGA 720

GCAATCGTCC CTTTTAATTT AACTTAGAGT TTTTTAAATT TTTAAGGAGT GAAAAAAATG 780

GATTACAAAG AAACGTTATT AATGCCTAAA ACAGATTTCC CAATGCGAGG TGGTTTACCA 840

AACAAGGAAC CGCAAATTCA AGAAAAATGG GATGCAGAAG ATCAATACCA TAAAGCGTTA 900

GAAAAAAATA AAGGTAACGA AACATTCATT TTACATGATG GCCCACCATA CGCGAATGGT 960

AACTTACATA TGGGACATGC CTTGAACAAA ATTTTAAAAG ACTTTATTGT ACGTTATAAA 1020

ACTATGCAAG GGTTCTATGC ACCATACGTA CCAGGTTGGG ATACACATGG TTTACCAATT 1080

GAACAAGCAT TAACGAAAAA AGGTGTTGAC CGAAAGAAAA TGTCAACAGC TGAATTCCGT 1140
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TTAGGTGTTC GTGGTGACTT TAATGATCCA TATATTACAT TAAAACCTGA ATACGAAGCT 126 0 

GCACAAATTC GTATTTTTGG AGAAATGGCA GATAAAGGTT TAATTTATAA AGGTAAAAAG 132 0 

5 

CCAGTTTATT GGTCTCCTTC AAGTGAGTCT TCATTAGCAG AAGCAGAAAT TGAATATCAC 13 80 

GATAAACGTT CAGCATCAAT TTACGTTGCA TTTGACGTTA AAGATGACAA AGGTGTCGTT 144 0 

GATGCAGATG CTAAATTTAT TATCTGGACA ACAACGCCAT GGACAATTCC ATCAAATGTT 1500 

10 

GCGATTACCG TTCATCCTGA ATTAAAATAT GGTCAATACA ATGTAAATGG cGAAAAATAT 156 0 

ATTATTGCAG AAGCCTTGTC TGACGCTGTA GCAGAAGCAC TGGaTTGGGA TAAAGCATCA 162 0 

15 ATCAAATTAG AAAAAGAATA CACAGGTAAA GAATTAGAGT ATGTTGTAGC ACAACATCCA 1680 

TTCTTAGACA GAGAATCGTT AGTGATTAAT GGTGATCATG TTACTACAGA TGCTGGTACA 174 0 

GgTTGTGTAC ATACAGCACC AGGTCACGGG GAAGATGACT ATATTGTTGG TCAAAAATAT 180 0 

20 GAATTGCCAG TAATTAGTCC AATCGATGAT AAAGGTGTAT TTACTGAAGA AGGCGGCCAA 18 6 0 

TTTGAAGGGA TGTTCTATGA TAAAGCTAAT AAAGCCGTTA CTGATTTATT AACAGAAAAA 1920 

GGTGCACTAT TAAAATTAGA CTTTATTACA CATAGCTATC CACACGACTG GAGAACAAAA 1980 

25 

AAACCTGTAA TCTTCCGTGC TACACCACAA TGGTTTGCCT CAATCAGTAA AGTAAGACAA 204 0 

GATATTTTAG ATGCAATCGA AAATACAAAC TTCAAAGTAA ATTGGGGTAA AACACGTATT 2100 

3Q TACAATATGG TTCGTGACCG TGGCGAATGG GTTATTTCTC GTCAACGTGT GTGGGGTGTA 216 0 

CCGTTACCAG TATTTTATGC TGAAAATGGC GAAATTATCA TGACGAAAGA AACAGTGAAT 222 0 

CATGTTGCTG ATTTATTTGC AGAACACGGT TCAAATATTT GGTTTGAAAG AGAAGCGAAA 22 8 0 

3$ GACTTACTAC CAGAAGGATT TACACATCCA GGCAGCCCTA ACGGTACATT TACTAAAGAA 2 34 0 

ACAGACATTA TGGACGTTTG GTTTGATTCT GGTTCATCAC ACCGTGGCGT GTTGGAAACA 24 00 

AGAOCGGAAT TAAGTTTCCC AGCGGATATG TATTTAGAAG GTAGTGACCA ATATCGTGGT 24 6 0 

40 

TGGTTCAACT CTTCTATCAC AACTTCAGTT GCTACAAGAG GAGTATCACC TTATAAATTC 252 0 

TTACTTTCTC ATGGTTTTGT TATGGACGGT GAAGGTAAGA AAATGAGTAA ATCTTTAGGT 2 58 0 

AATGTGATTG TACCTGACCA AGTGGTTAAA CAAAAAGGTG CTGATATTGC GAGACTTTGG 2 64 0 

GTAAGTAGTA CGGACTATTT AGCTGATGTT AGAATTTCTG ATGAAATTTT AAAACAAACA 2 70 0 

TCTGATGTTT ATCGTAAAAT CAGAAATACA TTAAGATTTA TGTTAGGTAA CATTAACGAT 276 0 

so TTCAATCCTG ACACAGATAG CATTCCTGAA TCAGAGTTAT TAGAAGTGGA TCGTTACTTG 2 820 

~- AAA-^TT TACGTGAATT TACTGCAAGT ACGATTAACA ACTATGAAAA CTTTGACTAC 2 8 80 
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CAAACAGTGT TATATCAAAT TTTAGTTGAT ATGACGAAGT TGTTAGCACC AATCTTAGTG 3 06 0 

CATACAGCTG AAGAAGTTTG GTCTCATACA CCACATGTTA AAGAAGAAAG TGTTCACTTA 3120 

5 

GCAGACATGC CTAAAGTTGT AGAAGTAGAT CAAGCTTTAT TGGATAAATG GCGTACATTT 3180 

ATGAATTTAC GTGATGATGT GAACCGTGCA TTAGAAACTG CTCGTAATGA AAAAGTTATT 3 24 0 

10 GGTAAATCAT TAGAAGCTAA AGTTACGATT GCTAGTAACG ATAAATTTAA TGCATCTGAA 33 00 

TTCTTAACTT CATTTGATGC ATTACATCAA TTATTTATCG TGTCACAAGT TAAAGTTGTA 3 360 

GATAAGTTAG ACGATCAGGC AACAGCTTAT GAACATGGTG ATATTGTCAT CGAACATGCA 34 2 0 

15 GATGGTGAAA AATGTGAAAG ATGTTGGAAC TATTCAGAGG ATCTTGGTGC TGTTGATGAA 34 8 0 

TTGACGCATC TATGTCCACG ATGCCAACAA GTTGTAAAAT CACTTGTATA ATTGAAATTG 3 54 0 

TATAAAGTAC TCATACAGAT GATATAAATT AAAGCTCTCT TCATAATCAT GTTGTAGTTT 36 00 

20 

TTGTTGACAT GATGAAGAGA GTTTTTTTGT GAATAAAAAA ATGACCAAGT TACCGGTCAT 36 6 0 

ATATGTAAAA AATGTGCGAT TTACTAAAAT AAAAATTATT CAGGAATGGT ACAAATTCTC 3720 

TGAGGCATAT AAATGCGTTA TAGTTGCTAT TCTCAATTAT GTTCGCGATA ATTTTAAGTA 3 7 80 

25 

AAAGTAAGCA CAGATATTGA ATTTGATAGG AGTTAATTGA ATGTATCATA ACAGTAACGC 3 84 0 

AAACTTTGTC pj^qq^^qp^ CTTTAAATGT GAGAGATAAG AATGAATTAA AGCCATTTTA 3 900 

30 TGAGGACATA TTAGGATTAA ATATTATAAA TGAGACATTA ACATCGATAC AATATGAAGT 3 960 

AGGTCAAAAT AATCATGTCA TTACACTTGT TGAATTACAA AATGGACGTG AACCTTTAAT 4 0 20 

GTCCGAAGCG GGACTGTTTC ATATCGCAAT TAAACTACCT CAAATTAGTG ATTTAGCTAA 4 0 80 

35 TTTACTAATT CATTTAAGCG AATATGATAT TCCAGTTAAC GGAGGTATAC AGCCTGCTTC 414 0 

GTTATCATTA TTTTTTGAAG ACCCGGAAGG AAACGGTTTT AAATTTTATG TTGATAAAGA 4 2 00 

CGAAGCGCAA TGGACGAGGC AAAATAATTT AGTAAAAATT GATATTAGAC CATTAAATGT 4 260 

40 

ACCGAGATTA GTGAGTCATG CAACAAAATT GTTATGGTTA GGTATTCCAG ATGACGCTAT 4320 

TATAGGTGCA TTGCATATTA AGACAATTCA TTTATCAGAG GTAAAAGAGT ACTACCTCGA 43 80 

45 TTATTTTGGA TTAGAGCAAT CGGCATATAT GGATGATTAT TCAATATTTT TAGCATCGAA 44 4 0 

TGGCTATTAT CAACATTTGG CCATGAATGA TTGGGTATCA GCAACGAAAC GTGTAGAAAA 4 500 

TTTTGATACG TATGGATTAG CAATTGTTGA CTTTCATTAT CCTGAAACAA CACATTTAAA 4 56 0 

50 TTTACAAGGT CCGGATGGTA TCTATTATCG CTTTAATCAT ATCGAAGTTG AAGATTAGTA 4 62 0 

TATACTTTGA ATGGACGAAC CATATAATGA ATCGTTTTTA ATGATCTTTT TATACAAGTT 4 6 80 

ATGAAGGAGG CTGGGACATT AAGTT CTTAG GCAATGTAAA AAGCTGATTT CTATTAATTA 4 74 0 
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w 






TTTTCCTTAT ATTAATTGCC ATTAATACAA AACCTAGCTC TCGTTTAACT TTATTTATTC 4860

CTCGAACTGA CATTCGnGTG AACTCAAAAT nGCCTACTTn CTTAAATTAC CAATATCTAT 4920

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 626 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TGGATTGCCA TTACATGGAC AAGATTTAAC TGAATCAATT ACACCATATG AAGGTGGTAT 60

CGCTTTTGCA AGTAAACCAT TAATTGATGC TGATTTTATT GGTAAATCTG TATTAAAAGA 120

TCAAAAAGAA AATGGTGCAC CAAGAAGAAC AGTGGGATTA GAATTACTTG AAAAAGGAAT 180

TGCAAGAACT GGTTATGAAG TTATGGATTT AGATGGAAAT ATTATTGGAG AAGTAACTTC 240

AGGAACACAG TCTCCATCAT CAGGAAAATC AATTGCACTT GCAATGATAA AAAGAGATGA 300

GTTTGAAATG GGTAGAGAGT TGCTTGTTCA AGTTCGTAAG CGTCAATTAA AAGCGAAAAT 360

TGTTAAGAAA AATCAAATTG ATAAATAATT AAAAAGGGGT GTGCATTGTG AGTCATCGTT 420

ATATACCTTT AACTGAAAAA GACAAGCAAG AAATGTTACA AACAATTGGT GCAAAATCTA 480

TAGGAGAATT ATTCGGTGAT GTACCAAGTG ACATTTTATT AAATAGAGAT TTAAATATTG 540

CTGAAGGCGA ACGGAGAACA ACGTTACTTA GAAGATTnAA TCGCATTGCA AGCAAGAGTA 600

TCACTAGAGG AACGCGTACA TCGTTT 626

(2) INFORMATION FOR SEQ ID NO: 28: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1126 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

nGGAAGTGGT GTATATATTT GTAATGAGTG TATTGAATTA TGCTCAGAAA TCGTCGAAGA 60

AGAATTAGCT CAAAACACTT CTGAAGCGAT GACAGAATTA CCTACTCCTA AAGAAATTAT 120

GGATCATTTA AACGAATATG TTATTGGTCA AGAAAAAGCT AAAAAATCTT TAGCTGTAGC 180
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AACCTTAGCC AAGACGTTGA ATGTACCATT TGCAATTGCA GATGCGACAA GTTTAACTGA 360 

AGCTGGTTAT GTAGGCGATG ATGTTGAAAA TATCTTGTTG AGATTAATTC AAGCAGCTGA 420

CTTTGACATT GATAAAGCCG AAAAAGGTAT TATTTATGTA GATGAAATTG ATAAAATTGC 480

ACGTAAATCT GAAAACACAT CTATAACACG TGACGTTTCA GGTGAAGGTG TTCAACAAGC 540

ATTGCTTAAA ATCTTAGAAG GTACGACTGC AAGTGTTCCG CCACAAGGTG GACGCAAACA 600

TCCAAACCAA GAAATGATTC AAATTGATAC AACAAATATC TTATTTATTC TTGGTGGTGC 660

CTTTGATGGT ATTGAAGAAG TGATTAAGCG CCGTCTTGGT GAAAAAGTTA TTGGTTTCTC 720

AAGCAATGAA GCTGATAAAT ATGACGAACA AGCATTATTA GCACAAATTC GCCCAGAAGA 780

TTTGCAAGCC TATGGTTTGA TTCCTGAATT TATCGGACGT GTGCCAATTG TAGCTAATTT 840

AGAAACATTA GATGTAACTG CGTTGAAAAA CATCTTAACG CAACCTAAAA ATGCACTTGT 900

GAAACAATAT ACTAAAATGC TGGAATTAGA TGATGTGGAT TTAGAGTTCA CTGAAGAAGC 960

TTTATCAGCA ATTAGTGAAA AAGCAATTGA AAGAAAAACA GGTGCGCGTG GTTTACGTTC 1020

AATCATAGAA GAATCGTTAA TCGATATTAT GTTTGATGTG CCTTCTAACG AAAATGTAAC 1080

GAaGGTAGTT ATTACAGCAC AAACmATTAA TGrAGaACTG AACCAG 1126

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4392 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

ATTGACTTCT TAGCAATnAA TaTGAGTGAA GAACGTACTG TTGAAGTACC AGTTCAATTA 60

GTTGGTGAAG CAGTAGGCGC TAAAGAAGGC GGCGTAGTTG AACAACCATT ATTCAACTTA 120

GAAGTAACTG CTACTCCAGA CAATATTCCA GAAGCAATCG AAGTAGACAT TACTGAATTA 180

AACATTAACG ACAGCTTAAC TGTTGCTGAT GTTAAAGTAA CTGGCGACTT CAAAATCGAA 240

AACGATTCAG CTGAATCAGT AGTAACAGTA GTTGCTCCAA CTGAAGAACC AACTGAAGAA 300

GAAATCGAAG CTATGGAAGG CGAACAACAA ACTGAAGAAC CAGAAGTTGT TGGCGAAAGC 360

AAAGAAGACG AAGAAAAAAC TGAAGAGTAA TTTTAATCTG TTACATTAAA GTTTTTATAC 420

TTTGTTTAAC AAGCACTGTG CTTATTTTAA TATAAGCATG GTGCTTTTTG TGTTATTATA 480

AAGCTTAATT AAACTTTATT ACTTTGTACT AAAGTTTAAT TAATTTTAGT GAGTAAAAGA 540
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20 



CTTACTAAGC TAAAGAATAA TGATAATTGA TGGCAATGGC GGAAAATGGA TGTTGTCATT 660

ATAATAATAA ATGAAACAAT TATGTTGGAG GTAAACACGC ATGAAATGTA TTGTAGGTCT 720

AGGTAATATA GGTAAACGTT TTGAACTTAC AAGACATAAT ATCGGCTTTG AAGTCGTTGA 780

TTATATTTTA GAGAAAAATA ATTTTTCATT AGATAAACAA AAGTTTAAAG GTGCATATAC 840

AATTGAACGA ATGAACGGCG ATAAAGTGTT ATTTATCGAA CCAATGACAA TGATGAATTT 900

GTCAGGTGAA GCaGTTGCAC CGATTATGGA TTATTACAAT GTTAATCCAG AAGATTTAAT 960

TGTCTTATAT GATGATTTAG ATTTAGAACA AGGACAAGTT CGCTTAAGAC AAAAAGGAAG 1020

TGCGGGCGGT CACAATGGTA TGAAATCAAT TATTAAAATG CTTGGTACAG ACCAATTTAA 1080

ACGTATTCGT ATTGGTGTGG GAAGACCAAC GAATGGTATG ACGGTACCTG ATTATGTTTT 1140

ACAACGCTTT TCAAATGATG AAATGGTAAC GATGGAAAAA GTTATCGAAC ACGCAGCACG 1200

CtiCAATTGAA AAGTTTGTTG AAACATCACG ATTTGACCAT GTTATGAATG AATTTAATGG 1260

TGAAGTGAAA TAATGACAAT ATTGACAACG CTTATAAAAG AAGATAATCA TTTTCAAGAC 1320

CTTAATCAGG TATTTGGACA AGCAAACACA CTAGTAACTG GTCTTTCCCC GTCAGCTAAA 1380

GTGACGATGA TTGCTGAAAA ATATGCACAA AGTAATCAAC AGTTATTATT AATTACCAAT 1440

AATTTATACC AAGCAGATAA ATTAGAAACA GATTTACTTC AATTTATAGA TGCTGAAGAA 1500

TTGTATAAGT ATCCTGTGCA AGATATTATG ACCGAAGAGT TTTCAACACA AAGCCCTCAA 1560

CTGATGAGTG AACGTATTAG AACTTTAACT GCGTTAGCTC AAGGTAAGAA AGGGTTATTT 1620

ATCGTTCCTT TAAATGGTTT GAAAAAGTGG TTAACTCCTG TTGAAATGTG GCAAAATCAC 1680

CAAATGACAT TGCGTGTTGG TGAGGATATC GATGTGGACC AATTTCTTAA CAAATTAGTT 1740

AATATGGGGT ACAAACGGGA ATCCGTGGTA TCGCATATTG GTGAATTCTC ATTGCGAGGA 1800

GGTATTATCG ATATCTTTCC GCTAATTGGG GAACcAATCA GAATTGAGCT ATTTGATACC 1860

GAAATTGATT CTATTCGGGA TTTTGATGTT GAAACGCAGC GTTCCAAAGA TAATGTTGAA 1920

GAAGTCGATA TCACAACTGC AAGTGATTAT ATCATTACTG AAGAAGTGAT CAGCCATCTT 1980

AAAGAAGAGT TAAAAACTGC ATATGAAAAT ACAAGACCCA AAATAGATAA ATCAGTGCGC 2040

AATGATTTGA AAGAAACGTA TGAAAGCTTT AAATTATTCG AAAGTACATA CTTTGATCAT 2100

CAAATACTAC GTCGCTTAGT AGCGTTTATG TATGAAACAC CTTCGACAAT TATTGAGTAT 2160

TTCCAAAAAG ATGCAATCAT TGCAGTTGAT GAATTTAATC GTATTAAAGA AACTGAAGAA 2220




TAGAGTCTGA 


TTCGTTTATT 


AG CAAT ATT A 


TTGAAAGTGG 


TAATGGATTT 


2280 
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TCATGTAAAC CTGTCCAACA ATTTTATGGG CAATATGACA TTATGCGTTC TGAATTTCAA 24 6 0 

CGATATGTTA ATCAAAACTA TCATATCGTG GTTTTGGTCG AAAC CGAAAC TAAAGTTGAA 2 52 0 

5 

CGTATGCAAG CGATGTTAAG TGAAAtGCAT ATTCCATCAA TAACAAAATT GCATCGCTCA 2580

ATGTCATCGG GGCAAGCAGT GATTATTGAA GGCAGTTTAT CTGAAGGATT TGAACTACCT 2 64 0 

GATATGGGAT TAGTTGTCAT TACTGAGCGT GAgcTTTTTA AATCAAAACA GAAAAAGCAA 270 0 

CGAAAACGTA CGAAAGCTAT CTCAAATGCT GAAAAAATTA AGTCTTACCA AGATTTAAAT 2760 

GTGGGAGATT ATATTGTTCA TGTGCATCAT GGTGTTGGTA GATATTTAGG TGTTGAGACG 2 82 0 

75 CTCGAAGTGG GGCAAACGCA TCGTGATTAT ATTAAATTGC AATATAAAGG TACGGATCAA 28 80 

CTATTTGTTC CAGTAGATCA AATGGATCAA GTTCAAAAAT ATGTAGCTTC GGAAGATAAG 294 0 

ACGCCAAAAT TAAATAAACT CGGTGGCAGT GAATGGAAAA AAACAAAAGC TAAAGTTCAA 3 000 

20 

CAAAGTGTTG AAGATATTGC TGAAGAGTTG ATTGATTTAT ATAAAGAAAG AGAAATGGCA 3060 

GAAGGTTATC AATATGGGGA AGACACAGCT GAGCAAACAA CATTTGAATT AGATTTTCCA 3120 

TATGAACTTA CGCCTGACCA AGCTAAATCT ATCGATGAAA TTAAAGATGA CATGCAAAAA 3180 

25 

TCGCGTCCAA TGGATCGCTT GCTATGTGGT GATGTTGGTT ATGGTAAAAC TGAAGTTGCA 324 0 

GTGAGAGCAG CATTCAAAGC TGTAATGGAA GGAAAGCAGG TTGCATTTTT AGTTCCTACA 3 3 00 

30 AC TATTTTAG CTCAGCAACA TTATGAGACG TTAATTGAGC GTATGCAAGA TTTTCCTGTT 3 3 60 

GAAATTCAAT TAATGAGTCG TTTTAGAACG CCTAAAGAGA TAAAACAAAC TAAGGAAGGA 342 0 

CTTAAAACTG GATTTGTTGA CATAGTTGTT GGTACACACA AATTACTTAG TAAAGATATA 34 80 

35 CAGTATAAAG ATTTAGGGCT GTTGATTGTA GATGAAGAAC AACGATTTGG TGTACGCCAT 3 54 0 

AAAGAGCGTA TTAAAACATT AAAACATAAT GTAGATGTAC TAACATTGAC TGCAACCCCA 3 600 

ATAGCTAGAA CATTGCATAT GAGTATGCTA GGTGTGCGGG ATTTGT CAGT GATTGAAACG 3 660 

40 

CCGCCAGAAA ATCGTTTCCC AGTTCAAACA TATGTATTAG AACAGAACAT GAGTTTTAT C 3 72 0 

AAAGAAGCTT TAGAAAGAGA ACTATCCCGT GATGGCCAAG TGTTTTATCT TTATAATAAA 3 780 

GTGCAATCCA TTTATGaAAA ACGAGAACAA CTCCAGATGT TAATGCCAGA TGCTAACATT 3 84 0 

45 

GCAGTTGCTC ATGGACAAAT GACAGAGCGC GATTTAGAAG AAACGATGTT AAGTTTTATC 3 900 

AATAATgAAT ATGATATTTT AGTAACGACG ACGATTATTG AAACAGGTGT CGATGTCCCA 3 960 

50 AATGCAAATA CTTTGATCAT TGAAGATGCA GATCGCTTTG GATTGAGTCA GTTGTATCAA 4 02 0 

TTAAGAGGTC GTGTTGGTCG TTCAAGTCGT ATTGGTTATG CATACTTCTT ACATCCAGCA 4 08 0 

AATAAGGTAC TAACTGAGAC TGCAGAAGAT CGATTACAAG CGATTAAAGA ATTTACGGAG 414 0 
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10 






TTAGGTAAAC AACAGCACGG CTTTATTGAT ACAGTTGGAT TTGATTTGTA CAGTCAAATG 4260

TTAGAAGAAG CTGTAAATGA AAAACGTGGT ATTAAGGAAC CAGAATCTGA GGTGCCAGAA 4320

GTCGAAGTTG ATTTAAACTT GGATGCATAT TTGCCAACAG AATATATTGC AAATGAACAA 4380

GCTAAAATTG AA 4392

(2) INFORMATION FOR SEQ ID NO: 30: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 729 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

TTTCTTTTGA ATCTATATCG AGGTGGTTGG TAGGTTCATC TAAAATAAGT ACATTGTCAC 60

GTTGCAACAT AAGTAGTGCT AGTTGTAAAC GTGCTTTTTC ACCACCAGAT AAATCATTAA 120

TTATCTTTTT AACATCGTCT TGTACAAATA AGAAACGTCC AAGAACTGCT CGAATATCTT 180

TTTCATTCAT TAACGGATAT TGATCCCACA CATAATCTAA AATCGTTTTA CTAGATTTAA 240

ATTCTGCTTG CTTTTGATCA TAATAACCAA TTTGTAAATT TGCGCCGAAA GTAATATCGC 300

CATTAAGCGC TTTTTGTTGA TTAGCAATAG TTTTAATTAA GGTCGATTTT CCAATACCAT 360

TTGGCCCAAT GATTGCTATA TGATCGCCTT TAGAGACCTC TATACTCATA GGTTTGGTAA 420

TTGCAGTTTG ATAACCGATT TCTAAATTTT TTACATGCAT GACGTCATTA CCTGTATTCC 480

GGTCAAAGCC AAATTGAATA TTTGCACTTT TGGCATCTAA CATTGGTTTA TCAATGCGTT 540

CCATTTTTTC TAAAATCTTA CGTCTACTTT TTGCCATTCC ACTTGTTGAA GCACGGGTAA 600

TATTTTTCTC AACAAAAGTT TCTAATCGTT TTATTTCTGC TTGTTGACTT TCATATTCTT 660

GCATTCGTTT TTGATAATAT AAATCCCGTT GCTGTATAAA TTCCTCGTAA TTACCAACAT 720

AGCGTTTGA 729

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13856 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear






TGATGTTTCG ATACATTTGT TGCACCTTGT GGATATACTT TAAAGGTTGT GTCGTATGTT 120

TCCTTACTAT CTTTAGCTTC AGATTCCTGT GATTCAACCG TTTTATATTT TTCAAGTGCA 180

TGTCCTTCAA TATCAACTCG TGGAATAATG CGATTCAACC ATGCTGGTAA ATACCACGAA 240

CCTTTtCCAA ACAATTTCGt TAATGCAGGA ATTAACATCA TtCTGACTAC GAAGGCATCA 300

AAGAGTACAC CAAACGCTAA TGCCATACCC ATTGATTTAA TCATGACATC TTCTTGGAAT 360

ACAAACGCAA AGAAGACACT AAACATAATT AATGCAGCTG CTACAATAAC AGGACCGCTT 420

TCTTTCAATC CTACTTTGAT AGAATAATCA TTATCCCCTG TTTTACTATm yyCTTCATGr 480

ATTCGCGACA TAAGGAAGAC TTCATAATCC ATCGCTAATC CAAATAAGAT ACCTATAGTA 540

ATAACCGGTA AAAATGCTAG CATTGGTCCT GTCGTTTCAA TACCAAACAG ACCTTTCATA 600

AAACCATCTT GCATTACTAA TGTTGTAAAT CCTAATGTTG CCATTAATGA CAAGACGAAT 660

CCTAAAACTG CTTTTAATGG TATTAGAATT GAACGGAAGA CAATCATTAA TAAGAAAAAT 720

GCTAATACAA CAATGACTGA GGCAAATAAA GGTATCGCCT CATTTAACTT TTTAGACATA 780

TCAATATTAA TGACACTTTG TCCCGAAATC TCCGTTTTGA ACCCATATTT ATCTTGTGCA 840

TCTTTATGAT AATCTCGTAA ATCATGCACT AAATCATTTG TACTCTCTGC ATTAGGCCCT 900

TGCTTAGGTA TCACGACCAT CAAAGCGTAA TCATTATCTT TACTCATTTG TGGTGGCGTA 960

ACGATATCTA CATTTTTCTT ATCTTTAATA TCTTTATATA CAGACTGTAA ATCTTGTTGT 1020

AATCCTTGTG GATCATCCTT TTTATCTTTC ACATTTATCA ACATCGGTAT TTGGCCATTA 1080

AATCCTTCAC CAAATTTATC CGAGATAATA TCGTAAGCTT TTTTCTGTGT AGAATCTGCT 1140

GGTTTAACAC CGTCATCTGG AATACCAAGT CGCATATGAC TAACTGGTAT TGCAGCTGCT 1200

ACTAATATGA TTAAACCTAG TAATACTGCC GCAAGTGCAT TTCCTGTAAT AAATTTAGAC 1260

CATGGCGTAT CAATATCTTT TTTGAATTTA GACTGTAATT TATTCACTTT AATGCGTTtA 1320

TGGAAAATGC TTATTAATGC AGGTAATAAA GTTAAAGCGC TAAGTACTGC AAAAACAACA 1380

CTAATTGCCG AAGCAAATCC CATTACCGCT AAGAAGTCAA TGCCTACTAA TGATAAACCA 1440

CATACTGCAA TTACAACTGT TACACCAGCA AAAACAACTG CACTACCTGC TGTTCCTATT 1500

GCAAGACCAA TGCCTTTAAT GTAATCTGTT TCAGTTTTCA TAACTTGTCG ATATCTGAAT 1560

AAAATAAATA ATGCATAATC GATACCAACT GCTAGTCCAA TCATTACGGC TAATGTCAGT 1620

GTGACATTTG GTATATCGAA TGCATAAGTT AACAAACTGA TAATACCTAC ACCAGAGGCT 1680

AGACCAATCA ATGCACTTAT AATTGGTAAT CCTGCAGCAA TGACTGAACC GAATGTGATT 1740

AACAGTACAA CAAATGCAAC AATAATACCA ACTAGTTCAG AATTACCGCC TACTTCTGTA 1800
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AAATGACTTT 


TAACATTATC 


TCTAGAGCCA 


TCTTTTAAAG 


ATGTTTGACT 


AACGTCATAT 


1920 




GTGATATCTG CAAATGCAGT TGTTTTATCT TTACTAATTT GCTTATTTTC ATAAGGATCT 1980

GATATTTTAT CAATGTGCTT GTCATCTTTT TTAATATCAT CTAACGTTTT CTTAATATCT 2040

TTAGTAATGT TCGGTTGCAC AATACCATCA TCTTTAGTCG TCTTAAAGAC AACACGTATT 2100

TGTGCCTTTT CACTATCTTG ATTAAAATGT TTTTCAATCT TTTTATTCGT ATCTAACGAC 2160

TCTAATCCTG TCATTTTAAT ATCATTGTCA AATTTCGGTG CATTTGTAGC AAGTGGTATC 2220

AATATTGCAG CTACAATCAC TATCCATGCA ATGACCGCGG ACCATTTATG TTTTGCGATG 2280

AATGTCCCCA TCTTATATAA AAATTTTGCC AAAGTATATT GCCTCCTTTT AAAATCAACG 2340

TTATAGTTTA AATATACAGT GTAGATTATT GTTCGATTAT AGTATCTATC CCCGACCTCT 2400

TAAAGAATCA ATTGGAAAAT TTTGTATATT AAACTACACA CAAAGGAGAA ATGTAGATGA 2460

AAGAGACTGA [illegible] ATAAAGACAA AAAAAGCATT GTCGAGTAGC TTGCTACAAT 2520

TGTTAGAACA GCAATTATTC CAAACGATTA CTGTCAATCA AATTTGCGAC AACGCACTCG 2580

TACACCGTAC AACATTTTAT AAACATTTTT ATGATAAATA TGATCTTCTA GAGTACTTGT 2640

TCAATCAATT GACTAAAQjAL TACTTTGCTA GAGATATCAG TGACCGTCTT AATCATCCAT 2700

TCCAAACGAT GAGTGATACG ATTAATAATA AAGAGGATTT GAGAGAAATC GCAGAATTCC 2760

AAGAAGAAGA CGCTGAATTT AATAAAGTAT TAAAAAATGT CTGCATTAAA ATTATGCATA 2820

ACGATATCAA AAATAATAGA GAGCGTATCG ATATTGACAG CGACATCCCA GATAATCTCA 2880

TATTTTATAT TTATGACTCG TTGATTGAAG GTTTTATACA TTGGATAAAA GATGAAAAAA 2940

TTGATTGGCC TGGCGAAGAT ATTGATAACA TTTTCCATAG ATTAATCAAT ATTAAGATTA 3000

AATAGTAGAT GAGAAACTCA TGAGCGTTAC CAACATTCAT AATAAAAACG ATAGTGkACA 3060

CGTTAATGAA TTCGTGTACT ACTATCGTTT TTTATTTTTA TCGTGCTTAT CGCTATTAAA 3120

ACAACTGATA CACAACACAT AAACTATGAA GAAAAAAATA AATCCGCTAT CTAAATGACT 3180

TTGACTCAGT TGTTTAAATG ACCAAATTGC TAATACAATT CCCATTATTA TTGAAATAAC 3240

GTATCTCACA TTCTTATACC TATAATCCTT TTCTAAAAAT ATGGTTGCTA TTACTTAATT 3300

TTTAAAGTTA TAAATAAAAA GAGCCAACCG CAATGGATGG CCCTTGTTCA TTATGAAGCA 3360

TTAGAACATT TCTGAAACAA CCTTTTGTTC TAAGAAGTGT AATAAGTAGT CTGGACTACC 3420

TGTTTTAGCG TCCGTACCTG ACATTTTGAA ACCACCAAAT GGATGGTATC CAACAACTGC 3480






CCTCTGTTAA 


GGTATAAATT 


GCCTACATCA 


AATTCGTTTA 


CCGCTTTAAT 


3540 



TTCTTCTTGC ATGATTCTAT CTTTAGATTT 
GTAACCTTTT GAATCATCAG TGCCGCCACC 

5 

CTCAATATAA TTTTTAATCT TATCAAATTG 
ATTGTCTACA GTATTGCCCA ACGTTAATTC 
TTCGTCATAA ACGTCTTTAT GCACAATTGC 

w 

AAAACCAAAT GCTGACGTTA CAATAGCTTC 
AACTACAATG GCATCTTTAC CACCCATTTC 

15 TTCTTGAACA ACGGCACTAC GTTCATAAAT 

TGTAACGAAA TGCGTATCTT TATGATCAAC 
AGGAACAAAG TTAACTACGC CTTTTGGTAA 

20 ATAAGCGATA TAAGGTGTAT CCTCAGCAGG 

TGGTGCTAAA GTTGTACCAG CCATAATCGC 
ACCTGTACCA ATTGATTTAT AGAAATATTT 

25 

CTTA.CCTTGA GCCAAGTCCA TCATTGAACG 
AGCTGCATCA CCAACTGCTT CATCCCATGG 
AATTTCCGCT TTTCGACGAC GAATAATTGC 

30 

ATTTGCTGAC CATGTTTTCC AAGATTTATA 
AACATCTTGT TTTGTTGCCT TTGATGCATT 

35 GATTGATTTA ATTTTGTCAT CTTTGAAAAT 

TTGACCTAAT TCTTTTTCCA CGT CTTTCAA 
GACTGAAAAA TCGTAACCAG GTTCATTTTT 

40 ATAAATTTTG AAAGTGGTTT AACCCTTTGA 

TTACTATGAT TAAGGTTAGT TTTGCAATCG 
CAAGTATTTT GAAATTGATT GGTTACTTTT 

45 

TATCGTTTCG TCATTTAATG TTT CGGATGG 
ACAAGGGTTT CCAACCGCTA AGCTGTGTGG 
50 AC CAATCACA CTGCCTTCTC CAATCGTCAC 

CCAAGTATTA CTGCCAATAT GAATGGGTCC 
GAAATTAAGT GGATGTGTCG CTGTGTAGAA 
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AAGTCCTGAA ATGATTGTTG GTTCTACAAA 3 72 0 

TTGTTCTAAT TTACCTTCTT CTTTACCAAT 378 0 

TTTTTTATTA ATAACTGGGC CCATATACGT 3 84 0 

TTTTGTTAAT TTGATTGATT TCTCTAATAC 3 900 

ACGTGAACAT GCTGAACATT TTTGACCAGA 3 96 0 

TGCTGCCATA TCTGTATCAA TATTTTCATC 4 020 

AGCGATAACA CGTTTCAAGA AGTTTTGACC 4 080 

TCTAGTACCT GTCGCACGTG ATCCTGTAAA 4140 

TAAGTAATCA CCAATTTCTT TCGGATCACC 4200 

TCCTGCTTCT TCTAAAATTT CCATTAATTT 4260 

TTTCAATAAC ACTGTATTAC CTGCCACAAC 4320 

AAACGGGAAG TTCCACGGCG GAATTGTAAC 4380 

ATTGTGTTCA CCTTCACGAT CAAGTACTGG 4440 

TGCATAGTAT TCAATAAAAT CAATACCTTC 4 50 0 

CTTACCTGCT TCATAAACCA TAATTGCTGC 456 0 

CGAAACACGT AACATAAGCT CTGCACGATC 4620 

AGCTTCGTTT GCTGCTTTAA ACGCATCTTC 4 6 80 

TGCAATCACT TGTGATGTGT CTGCAGGATT 4 74 0 

CTTCTCTCCA TTAATCACTA ATGGTATGTC 4 8 00 

TGCTTTCTTA AA CAT AT C CA CATTTTCTTG 4 860 

AAATTCTACT ACCATGTACA CTTACCCCCT 4 920 

TTTAATGATA TAACATCATT TAAACTCATT 4 980 

CTTTCATTTT TATGTTTTAT CACTTATTCT 5 04 0 

TAAAATTTAT ATGGGTCGCA ACTGCTACTT 5100 

TAGGTCATTA TCAATTTTAC GAACGACTTT 5160 

CGGAATATCT TTAGTGACAA CACTACCAGC 522 0 

CCCTGGTAAC ACGGCTACAT GACCGCCAAA 52 8 0 

GGCTTTTTCA AAACCTTCAT TTCTATGATG 53 4 0 

TCCACAATTA GGTCCTATAA AAACATTATC 54 00 
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TCCTAGTTTA ACGTTCCAAC CATAATCTGT ATCAAAAGGA AT CG AAATAC TTACATTGTC 552 0 

TGT7GTTGTT TGAAATAATT GATCAATTAA TTCCTTTCTT TTATTTGTAG CACTCGGTCT 55 8 0 

5 

TGTATGATTT AATTC AAA GC AAATATCTTT CGCTCGTGCA CGTTCATTGA TTAAGTATTG 564 0 

ATCAAAGTTT GCATCGTACC ATTTTTCTGC TAACATTTTT TCTTTTTCAG TCATTACACC 57 0 0 

TTTCAACTCC TAATAACTTA TTTACTTGTT TAAAAGTTAA TCAAATAAAC CTTCGCCTAT 57 60 

w 

GCAACTAATA CGCTATAACA TTATGAAATC ATGACCTTAT CACCCTTATC TATACAATTC 5820 

TCGCATCAAA TACTGCTAAA GTAGTAGATA AATTCAATAC TACAGACGCA TTCATTTTTT 5880 

is AATCTATTAA CGTACAATGT GAGTAAGAGA AATATAAAGG AGTATGATAG CGATGAGAAT 594 0 

ATTAATTACA GGCACAGTTG CTATCTTAAT CATTCTAGGT TTGGTCAAAA CGATACAAGA 6000 

TTACGAAATG ACAAACGACA CGAGTCGTcA GTTGTCAGAC AACAAAGATG ATGATAAAGT 6060 

20 

CATC CAT CTT AATAATTTTA AAAATTTACA TGCGAAAGAA TTTAACCCAT CTGATTTCTT 612 0 

TTAAGTCACC TAAGAATTGC AAATCCAGAA GTCATTTAAG TTTTACCTTT CATTCATACA 6180 

TCCTTTAATA TTAATTACGA CTTCTTTTAT ATAGATGCTA AGTAGAGAGA TTGTTGTGCA 624 0 

25 

ATGTTTGCAC GGCAATCTCT CTTTTTCTTT TTAAAATTGG TAAAAGTAAA ACGCAACGAT 63 00 

TGACTTATAT ACCTATAGGG GGTACATTAG ACGTGTAACA ATGAATCACA GGGAGGCAAT 6360 

30 AATGTGGCTA ATACGAAAAA AACAACATTA GATATCACTG GTATGACTTG TGCCGCATGT 64 2 0 

TCAAATCGTA TCGAAAAGAA ACTGAATAAA CTTGATGACG TTAATGCCCA AGTGAATTTA 64 80 

ACTACAGAGA AAGCAACTGT TGAGTATAAC CCTGATCAAC ATGATGTCCA AGAATTTATT 654 0 

35 AATACGATTC AACATTTAGG TTACGGTGTC GCTGTAGAAA CTGTCGAATT AGACATTACA 66 00 

GGTATGACTT GTGCTGCATG CTCAAGCCGT ATTGAAAAAG TGTTAAATAA AATGGACGGC 6660 

GTTCAAAATG CAACGGTCAA TTTAACAACA GAGCAAGCTA AAGTTGACTA TTATC CTGAA 6720 

40 

GAAACAGATG CTGATAAACT TGTCACTCGC ATTCAAAAAT TAGGTTATGA CGCGTCTATT 67 80 

AAAGATAACA ATAAAGATCA AACGTCACGC AAAGCTGAAG CGCTACAACA TAAATTGATT 6 84 0 

AAGCTTATCA TATCAGCAGT ATTATCTTTA CCACTATTAA TGTTAATGTT TGTACATCTT 6 90 0 

45 

TTCAATATGC ATATACCAGC ACTATTTACG AATCCATGGT TCCAATTTAT TTTAGCTACA 6 96 0 

CCTGTACAAT TTATTATTGG ATGGCAATTT TATGTAGGTG CTTATAAAAA CTTAAGAAAT 702 0 

so GGTGGCGCCA ATATGGATGT ACTTGTTGCT GTTGGTACAA GTGCAGCATA TTTTTACAGT 70 80 

a^TTATOAAA TGGTTCGTTG GCTAAATGGC TCAACAACGC AACCGCATTT ATACTTTGAA 714 0 
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TTAAAAGATG 


GTAATGAAGT GATGATTCCT CTAAATGAAG TACATGTTGG AGATACACTT 7320

ATCGTTAAAC CAGGTGAAAA GATACCTGTT GATGGCAAAA TTATTAAAGG TATGACTGCC 7380

ATCGACGAAT CTATGTTAAC AGGTGAATCT ATCCCTGTTG AGAAGAATGT TGATGATACT 7440

GTAATTGGTT CAACGATGAA CAAAAACGGT ACTATTACTA TGACAGCAAC AAAAGTTGGC 7500

GGGGACACTG CGTTGGCAAA TATTATTAAA GTTGTCGAAG AAGCTCAAAG TTCTAAAGCG 7560

CCGATTCAAC GATTGGCAGA TATTATTTCT GGTTATTTCG TTCCTATCGT TGTTGGTATC 7620

GCACTATTAA CATTTATCGT GTGGATTACT TTAGTTACAC CAGGTACATT TGAACCTGCA 7680

CTTGTTGCGA GTATTTCCGT TCTCGTCATT GCTTGTCCAT GCGCATTGGG ACTTGCTACA 7740

CCAACTTCTA TTATGGTAGG TACTGGTCGC GCTGCTGaAA ATGGTATTTT ATTTAAAGGT 7800

GGCGAGTTTG TTGAACGCAC ACATCAAATT GATACCATCG TTTTAGATAA GACGGGTACC 7860

ATTACAAATG GTCGTCCAGT CGTGACAGAT TATCATGGTG ACAATCAAAC GCTACAACTA 7920

CTTGCTACTG CTGAAAAAGA TTCTGAACAC CCATTGGCAG AAGCCATTGT CAATTATGCA 7980

AAAGAAAAGC AATTAATATT AACTGAGACA ACAACATTTA AAGCAGTACC TGGCCATGGT 8040

ATTGAAGCAA CGATTGATCA TCACCATATA TTGGTTGGTA ACCGTAAATT AATGGCTGAC 8100

AATGATATTA GCTTGCCTAA GCATATTTCT GATGATTTAA CACATTATGA ACGAGATGGT 8160

AAAACTGCTA TGCTCATTGC TGTTAATTAT TCATTAACTG GTATCATCGC AGTGGCAGAT 8220

ACTGTCAAAG ATCATGCCAA AGATGCTATA AAACAATTGC ATGATATGGG CATTGAAGTT 8280

GCCATGTTAA CTGGCGATAA TAAAAACACT GCTCAAGCCA TTGCAAAACA AGTAGGCATA 8340

GATACTGTTA TTGCAGATAT TTTACCAGAA GAAAAAGCTG CACAAATTGC GAAACTACAG 8400

CAACAAGGTA AGAAGGTTGC GATGGTTGGT GACGGTGTAA ATGATGCACC TGCATTAGTT 8460

AAAGCTGATA TCGGTATCGC CATTGGTACA GGTACAGAAG TTGCCATTGA AGCAGCTGAT 8520

ATTACTATTC TTGGTGGCGA CTTGATGCTT ATTCCTAAAG CCATTTATGC AAGTAAAGCA 8580

ACCATTCGTA ATATTCGTCA AAATCTATTT TGGGCATTCG GCTATAATAT TGCCGGTATC 8640

CCTATAGCTG CATTGGGCTT ACTTGCGCCA TGGGTTGCTG GTGCTGCAAT GGCACTAAGT 8700

TCAGTAAGTG TTGTCACAAA CGCACTTAGA TTGAAAAAGA TGCGATTAGA ACCACGCCGT 8760

AAAGATGCCT AGATTCCTTA ATAATGAAGG ATTCGTTGGT GATTCTGAGA TAGGCTAGTG 8820

ATTGGCTCTA TAATGTCGCG GTTTAyaGTt GGATCTTCGC TCCAACTGCA TATATAGTnA 8880

CACTTTTCGC TTGGCGAATT AGTGTATCTT ACCTAATAGc TCCGCCTATT AGGTTCCATC 8940

ATTATTATAA ATAATAAGTA CACTACGGtT TACAGTTGGA TCTTCGCTCC AACTGCATAA 9000
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GAAATTTTAA ATGTTGAAGG TATGAGCTGT GGTCACTGCA AAAGTGCTGT TGAATCTGCA 912 0 

TTAAATAATA TTGACGGTGT CACTTCAGCT GACGTTAACC TTGAAAATGG TCAAGTAAGT 9180 

5 GTTCAATATG ATGACAGTAA AGTTGCTGTA TCTCAAATGA AAGACGCAAT TGAAGATCAA 924 0 

GGTTACGATG TCGTTTAATT AGGCAATATT CAACGTCATC AACACCAAAT TAAAAAATCG 93 0 0 

AACTGATGAG AATCCCAACA ATC CAAATTA TCTCATCAGT TCGATTTTTA ATTTACTCGT 93 6 0 

w 

AACCTAGTAT CTCCAGTCTG CAATACATCT AATGTTGCAT CTAATGCATC GACAATTAGA 94 2 0 

TTTTTAACTG CAGCTTCAGT ATAAAACGCA ATATGTGGTG TTAATATGAC ATCTTCCCTG 94 8 0 

75 TCAATCAACG ATTCTAACAA TGGATCGTTC AGTGTTTTGC CCCTTTGATC ACTTGGGAAA 954 0 

AGTTTGCGTT CAAATTCATA CGTATCAAGT GCTGCACCTT TAATCACACC ATTGTCTAAT 960 0 

GCGTCTAATA ACGCCTTAGT ATCTACTAAA GAACCTCTCG CACAATTGAC AAATACTGCG 9660 

20 CCCTTTTTAA AATGTTTAAA TAATTCAGCA TTAAATAGAT AATGATTATA TTTCGTTGCA 9720 

GGTACATGTA ATGTCACGAT ATCAGCACCT TCAACCGCTT CCTCAATCGT ATCTTTGTAA 97 8 0 

TCGACATACG TTGCAATTTT AG CATT AGG A AACGGtCGTA TGCGACCACA TCACTTTGAT 9840 

25 

AACCATTGGC AAATATATCG GCTACTACAC GGCCAATTCG ACCTGTACCA ATAACAGCTA 990 0 

CTTTTAAATC TTTAATGGAT TTCGATAAAA TAGTAGGTTC CCATCTAAAA TCATGcTCCC 9 96 0 

GCACTTTCGT TTGAATTTGA TTAAAATGAC GAACCACATT AATAGCCTGG TTCACAGCAA 1002 0 

30 

ACTCCGCAAT TGAATTCGGA GAGTATGACG GCACATTTGA CACAATAAAG TTATACTTGT 1008 0 

TTGCTAACTC CAAATCATAT GTATCAAATC CAGCACTACG TTGTGCGATT TGTTTAATAC 1014 0 

35 C TAG TT CATT TAATCGTTTA TAAACATGCT CTGATAATGG TATTTGTTGT GATAGCGATA 1020 0 

AGCCATCATA ACCAGCGACA CCTTCAACAT TGTCATCAGT TAATGCTTCT TTAGTAATAT 1026 0 

CTACCTCAAC ATGATGTTTC TCTGCCCACG CCTTGATATA AGG CAT AT CT TCATCACGTA 10320 

40 CACTCATGAT TTTAATTTTT GTCATTTTAA CATCACCCTT AACTTTATTA TTCATATAAA 10380 

TATGCTAGTT CTGTTAATCT TATTGCAGCT TCGTCTAATT TCTGGTCATC TAACGCCAAT 10440 

GAAATTCTCA CATAACGATT ACCATTCTCT CCAAATGGTT TCCCTGGAGC AACAAGTATT 10 500 

GACTTCTCTT GCACTAAAAA TTGCTCAAAT TGCTCGCTGT CATAACCAGG CGGTGTTTCC 10 56 0 

AACCATACAT ATATGCCACC TTTAGCATGA ACAAATGGCA AATCAGCTTT TGCAAGCATG 1062 0 

so GCTTCGAATC GGTCACGACG TGTTTTAAAT ACATTGCTTT GTTCTTCTAA AAAATCATCA 10 6 80 

TAATGATTCA AAGCATATAT TGCGGCATCT TGTAATGCAC CAAACATCCC AG CATTTGTG 10 740 
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CCATTTTCCG AAGCAAGTAT ACTAGGATTT TTAGCGTCGA AACCGAAAGC ACCATAAGCA 10 920 

AAATCATGCA CGATTTTAGT GTCTGTACCT TTAAATTTAG cTATCGCTTC ATCAAAAACT 10 9 80 

5 

TCTTTCGTAG CTGTCGATCC AGTTGGATTA TTTGGATACG TTAAATAAAT GAGTTTTGTT 11040 

TTATCTATTA TTTGTGAATC AACTTTGGAC CAATCTGGCA AATAATGTGG CGGTTCTAAA 1110 0 

TTAAGCGGGA CTGGCTTGCC ATCAGCTAAA AGTACACCTG CTAAATAATC CGTGTAGCCT 1116 0 

10 

GGATCAGGTA GTAATACATA GTCTCCTGGA TTGATAACAC ATGTTGGTAC TGCCACTAAT 11220 

CCATTTTTTG TAC CATATAA AATGCATACT TCATCTTCTT TATCTAACGT CACATTATAT 112 8 0 

15 TGTCTTTGAT AAAAATCTAC AATAGCTTGC TTGAACGCTT CTTTACCATG AAAAGCACCA 11340 

TATTTTTGAT TTTCAGGAAT AGTTAGTGCT TTTTGAAAAT GATCAATAAT ACCTTGTGGC 114 00 

GTGGGCCCAT CAGGGATTCC AACTGCCATA TTAATTAATG GCAATGGTCC ATGTTCGATT 114 6 0 

20 TTACGTCCCA TCGTTTTCCC GAAATAACTA TCAGGGATAT TTGCTAATTT GTTAGAGATC 11520 

ATCAAATTCC TCCTCTATCA TTAAACATAG CCTGGGCGAC TATCATAATC CTAACAACTT 115 8 0 

GTATCACTCT CATTTAGATG GTTACAATGA CATCGCCATT CACCGTTATG TTCAACAGAA 1164 0 

25 

CTTATGACAC ACGTTGTATT GAATGAATTT ATTTTCATTT TAGGTAGGTA TAATATTATT 117 00 

GTCAATATTA GGAATTTTCA GATTAATATG CACTCAATCG TTATGATTTA ACTGTCATGC 117 60 

ATATCCGCAT GCGCAACCAG TTAGATATGC TTATATAAAG TATAACGCCC ATCAAGGTAC 11820 

30 

GTATTCAAAC GTGAACCTTA ACAGGCGTCA TTCATTGTTA AATAAAACTT CTTAAGCACA 118 80 

TACTTATTTC ACTATGCCTT TTACGTTCCC CTTATACTTT TCTCACATCT TTCTCTTAGA 11940 

35 CTACTCCCTT ATACGCCCCG CTCAATATCT TTAATCATTT CATCTACAGT TATTTTCGCA 12 00 0 

CTCGTTAAGA CAATAGGAAC GCCTGCACCT GGATGCGTAC TTGCACCTGC AAAATATAAA 12 06 0 

TCTTTATAAT CTCGCGATAC ATTTTGTGGA CGATAATAAT TACTTTGCGC TAAAGTTGGC 1212 0 

40 

ATTAAACCGA ATGCCGAACC AAATTTCGCA TGATACGTTT GCTCAAAATC ATTTGGCGTA 1218 0 

AAGATTGTTT CTGAAACAAT ATG CGATTTT ATATCTTCAA ATACTTCAAT CGTTGCTAAT 12 24 0 

TTACGATAAA TAATTTCCTT TATTTGTTGC GTCAAAGCTT CATCTGACCA ATCGATTCCG 12 30 0 

45 

CTACCTGTTT TAAGTTCCGG CGTCGGCATT AG C A CAT AAA TACCAGTTTT GCCTTCTGGC 123 60 

GCAAGTGATT TATCAGCGAC CGCTGGTACA TACACATAAA TAGAAGGATC ATATGATAAA 124 20 

so CGTCCCTCAA ATATTTCTTC AATATTGCCT CTAAAGTCAT CTGAAAAAAT AACATTATGA 12480 

AGTCTCACTT GATCTGTCAC ATCAATATCT ATACCGATAT ACATTAAAAA TGCTGAACAA 12 540 

GAGTAATCTA AGTCTGCAAT TTTATGTGGT GGATACTTTT TAATAGGTGC AAAATCTGGC 12 6 00 
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ATGTCACCAT TCACTTTTAT CGCATCGGCC CGTTTGAATT TAGGATCAAT AATAATTTGC 12 720 

TCAATTTCAG CATTTAGTTC AATATTAACG CCTAAGTCTT TATTTAATTG CGCTAGcCCT 12780 

TGAGCCATGC CAT A CAT AC C GCCTTTAATA AAATGCACAC CAAACATCAT TTCAATCATA 12 840 

GGAATAATTG AATATAGTGA CGGGCCTCGT TTTGGATCAA TTCCTATGTA TAACGTTTGA 12 900 

AACGCTAAAA GCTTTTGTAT CTTTTCGTTA TCAATATAAT GTTCAATTAG CTGATCTGCA 12 960 

TGATTTAACG TTTTTAACTT AGCACCTTGC ACAAGTGACG T CAT ATT AT A AAAGTCACTC 13 020 

GGTTTGCGAT ACGTTCTTTC TAAGAAATAG CGACGTGCAA TTTCATATTT TTTATAAACA 13 080 

TCCGTTAAAA AGGACATAAA ACCATGCGTT GAACCAGGTT CTATACTTTC TAGCATTTGC 1314 0 

TGTAATTCAG CTAAATCTGT AGGCACCGTT ATACGATCAT CGTGGTCAAA ATACACATCG 13 200 

TAAATATAAC GTAATTGTCT CAATTCAATA TAATCTTCAT AATTTTTACC ACACGCTGTA 13260 

20 AAAACATCTT TATAAACATC TGGCATCATG ACAATTGTGG GACCCATATC AAATGTAAAG 13 3 20 

CCGTCTTTCT TTAATTGATT CATACGCCCG CCTACATTAT TATTTTTTTC AAATATCGTC 13380 

ACTTCATGAC CTTGAGAAGC AATACGGGCT GCCGCTGCTA ATCCTGTGAC ACCTGCACCA 13440 

ATTACTGCAA TCTTCATTAT TCAACCACCT ATATTCTATG ATATTTACTA TTTATTTCAT 13 500 

GAAACAACTT TGCCTTTTTC CTCTTATCCA CAAAAACACG TTCATGTAAT GTATAGTTAG 13 560 

CCTGTCTCAC TTCGTCCAGT ATTTCAATAT ATATACGTGC TGCTAATTCT ATGATTGGTT 13 620 

GTGCTTCAAT ACTAAATACT TTGATTTCAT C CAT AAC AT C TTGAAAATCT TTTTCTGCGA 13 680 

TAGCTGCATA ATATTCCCAT AAGTCAATAT AATGATTATT AACACCATTT TGGTACACTT 13 740 

35 CAGCAATATC AACTTCATAT TGCTTTAATC GTTGCTTACT AAAATATATC CGTTCATTGT 13800 

CAAAATCTTC ACCGACATCT CTTAATATAT TAAnGGGATC CTCTAGAGTC GACCTG 13856 
(2) INFORMATION FOR SEQ ID NO: 32: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10088 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
ATATATAAAT ATAGATTAAG TATATAGATT AATCAACTTT TTTGGAAGAG CAAATCACGC 6 0 

AATCAACAAA TAATATAAGA AGTTTTTGCG ATAGTTTTAA AATAGCTGTA ATAGAATACT 12 0 






EP0 786 519 A2 

ATAATTGGTT AATATATGAG TAATTAGAAA ATAGACAAAG GATGACGATT TATGTATATC 3 00 

AATATGAAAG ATTATGGGTT AACAGGCATA AACAAAACTA AAGATACTCG AGCAATACAA 3 60 

CGTGCGTTAA ATCGTGGAAG ATGTAAACCA ACGACAGTTT ATATACCGAA AGGGACGTAT 420 

GATATTTGCA AACCATTAAC GATATATGGC AATACAACAC TTTTGTTAGA TAATGAAACT 4 80 

ATTTTACGCC GATGTCATTC TGGTCCTTTA TTAAAAAATG GTCGTCGCTT TGGTTTTTaT 54 0 

CGTGGTTATA ATGGACACAG TCATATTCAT ATTAAAGG CG GCAAGTTTGA TATGAATGGT 6 00 

GTATCGTATC CTTATAACAA TACAGCTATG TGCATTGGGC ATGCTGAAGA TATTCAATTA 66 0 

ATAGGTGTGA CCATTAAGAA TGTAGTGAGT GGTCATGCAA TTGATGCTTG TGGGATTAAC 72 0 

GGACTCTATA TTAAAAGCTG TTCATTTGAA GGATTCATAG ACTATAGTGG CGAACcTTTT 780 

ATTCTGAAGC AATACAATTA GACATTCAAG TACCTGGTGC TTTTCCAAAA TTCGGAACgA 84 0 

CAGATGGTAC GATAACGAAA AATGTCATTA TCGAAGATTG TTATTTTGGA CCTTCAGAAT 90 0 

TGCCCGAAAT GGGAAGTTGG AATCGTGCTA TTGGCTCACA TGCAAGTAGA CATAATCGAT 96 0 

ACTATGAGAA TATTCATATT AGAAATAATA TATTTGAAGA TATACAAGGT TATGCATTAA 102 0 

CTCCCTTGaA GTATAAAGAT GCTTTCATTA TTAATAATAA GTTTATTAAC TGTGaGGGTG 1080 

GCATTAGATA TTTAGGAGTT AGAGATGGTA AAAATGCAGC AGATGTGaTG ACAGGaAAAG 114 0 

ACTTAGGTTC CCAAGCAGGC ATAAATATGA ATATAATTGG AAATGAATTT AAAGGATCAA 12 0 0 

TGTCTAAAGA TGCGATACAT GTACGTAATT ATAATAATGT TAAACATAAA GATGTATTAA 12 6 0 

TCGTTGGGAA TACATTCAAT AATTCGACTC AATCAATTCA TTTAGAAGAT ATTGATACAG 13 2 0 

35 TGTTTTTAAG TCCTGTTGAA GCGGGTATTC AAGTTACTAC AATCAATGTA GATGAAATAA 13 8 0 

AAAAGTAAAA AGTTTCGCAT GACATTAGGA TTAAGAATAG TAGATAATTT TTGAAAGCGC 14 4 0 

ATTGATAAAA CGGTATAAAT ATGCTATAAT AAACCCAATT ATCTGATAAA AGGGGTATTT 15 0 0 

TGACGGTAAT GATAATACAA GATAGACAAC TTTCTATACT CTAATATAGT GAGTTGAAGT 156 0 

AGCTTGTCAT AATCATCATG AGGGGGAAAT TTATGGCTTA TTTCAATCAA CATCAATCAA 16 2 0 

TGATAT CGAA AAGGTATTTA ACATTCTTTT CAAAATCAAA GAAAAAGAAA CCGTTTAGTG 16 8 0 

CGGGACAACT TATTGGACTA ATATTAGGTC CATTACTTTT CCTATTAACA TTATTATTCT 174 0 

TTCATCCACA AGACTTACCT TGGAAAGGCG TCTATGTTTT AGCGATTACT TTATGGATTG 18 00 

CGACTTGGTG GATTACTGAA GCAATTCCTA TTGCAGCAAC GAGCTTATTA CCAATTGTGT 18 60 

TATTACCATT AGGTCATATA CTTACACCAG AACAAGTATC ATCCGAATAT GGCAATGATA 192 0 

TTATCTTTTT GTTTTTAGGT GGATTTATTT TGGCAATTGC AATGGAAAGA TGGAATTTAC 1980 
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IT GG ATT CAT GGTGGCAACA GGATTCTTAT CTATGTTTGT ATCGAACACT GCAGCTGTAA 2100 

TGATTATGAT TCCGATTGGT TTAGCAATTA TTAAGGAAGC ACATGATTTA CAAGAAGCCA 2160 

5 ATACGAATCA AACAAGTATT CAAAAGTTTG AAAAATCTCT AGTTTTAGCA ATTGGCTATG 2220 

CAGGTACGAT TGGTGGCTTG GGTACATTAA TCGGAACCCC GCCATTAATT ATTTTAAAAG 22 80 

GACAATACAT GCAACATTTT GGACATGAAA TTAGTTTTGC TAAATGGATG ATTGTAGGGA 234 0 

10 

TTCCAACGGT CATTGTTTTG TTAGGTATTA CTTGGCTCTA TTTAAGATAT GTTGCGTTTA 24 00 

GACATGATTT GAAATATTTa CCTGGTGGTC AGACGTTAAT TAAACAAAAG TTAGACGAGC 24 60 

TTGGCAAAAT GAAGTATGAA GAAAAGGTAG TACAAACTAT CTTTGTACTT GCTAGCTTAT 2 520 

TATGGATTAC AAGAGAGTTT CTTCTGAAAA AATGGGAAGT TACGTCATCT GTTGCAGATG 2580 

GTACGATTGC TATTTTTATA TCAATATTAT TATTTATTAT TCCAGCTAAA AATACTGAAA 2 64 0 

20 AACATCGCCG TATCATTGAC TGGGAAGTTG CAAAAGAGCT CCCTTGGGGT GTATTAATTT 2700 

TATTTGGTGG CGGTTTAGCA TTAGCGAAAG OTATTTCTGA AAGTGGTTTA GCAAAATGGT 2760 

TAGGCGAACA GTTGAAATCA TTAAATGGTG TTAGTCCGAT TCTTATTGTA ATTGTCATAA 2820 

25 CAATCTTTGT CTTATTTTTA ACTGAAGTGA CATCTAATAC TGCAACTGCA ACGATGATTT 288 0 

TACCGATTTT AGCAACGTTG TCTGTTGCTG TTGGAGTGCA TCCATTACTA CTTATGGCAC 2 94 0 

CTGCAGCTAT GGCGGCTAAC TGTGCATACA TGTTACCAGT AGGGACACCA CCGAATGCAA 3 000 

30 

TTATCTTTGG TTCTGGTAAA ATATCTATCA AACAAATGGC ATCAGTAGGA TTCTGGGTAA 3 06 0 

ACTTAATCAG TGCAATAATT ATTATTTTAG TCGTGTATTA TGTAATGCCT ATAGTTTTAG 312 0 

J5 GTATTGATAT AAATCAACCA CTGCCATTGA AATAGTAATT GCAGATTAGA ACGAAAAATA 318 0 

AAAGGTTACA TTAGCAATTG CTTGGACGAG TGGTAACGAA ACGTATACCG CAGCATCGTG 324 0 

TAAfiAACAAT ACAAACAAAA GAAAGTCAAC CAAGGATGGA TTCCTATTTT AATCCTTGGT 33 0 0 

40 TGACTCTTTA TTTTATTTAA ATTGTAGAAC CTAGAAAATA AAGTTTAATT AAAAGCACCA 33 6 0 

ATCATTTCTA CTTTGAAATC TAAGGTTTCT AAAATAGCAA TGACTTTCTT TATATCGGTT 34 2 0 

GTAATTGCAG AATCAGCCTG AACGAAAAAT CGATACATAC CTAATTGTGT TTTTAAAGGA 34 9 0 

45 

CGAGACl'CAA TCCAGGATAA ATTAATATTA AACAAAGCAA ATGTATTAAG CACACTTGCT 3 54 0 

AACAACCCAG GTTTATCATG CATTGGTGTA ATTAAAAACA TCAATGATGT CGCATTTTGA 3 6 00 

TCAAATTGCT GCTGATTTTT TATAACTAAA AAACGTGTCA CGTTATGTGG ATAGTCTTCA 36 6 0 

so 

ATATGTGTAT CAATAGGTGT AAAACCATAA GctTCGCCAC TACCTAAAGG TGCAATTGCT 3 72 0 






TTTTTAATAT CAGAAATGGA ATCTGTTCCA TTACCATATA ATGCAAAGTT AATATCTAAA 3 900 

CGTATTTCAC CGTGTGCAAA GACATCTTGC TGTGCAAGTG CATCTGCCAC AATGTTGATT 3960 

5 

GTTCCTTCTA TAGAATTTTC AATAGGGACA ACACCAATCG ATGTGTCATC ATCTGCAACT 4 020 

GCCTTGATGA CTTCAAATAA ATTTGACTTT GGTTGAAAAG TTGCTTCATT TTCAGAAAAA 4 080 

TACTGACGAC AAGCCAAATA TGAAAATGTA CCTTTAGGGC CTAAATAATA TAATTGCATA 414 0 

10 

TGCTACACCT CTACTAACTT AATGATGGAA AGGGCACTGG TTAGCATTTG ATTCTTTCTT 4 200 

TTTATAGAAA AAGTTTGGAT CTTTTACTGT ATTGTCATAT CCGTGATGAT AATTTGACGT 4260 

15 CAATGTTGGA GATAATGGCG GTGCTAGCCA AGACCATTTT CCGGTAACTT GACGACCTTG 4320 

TTGTGCTTCG TTACGTTCGA ATAGTTCGAA TTGCTTTGCA GCGGTCAAAT GATCGACAAT 4 380 

TGATACGCCT TCTTTTTTAA AGGAATGATA CACAGCATAG TTCAATTCAA CAAGTG CTCG 444 0 

20 

AT CTTT ATT A AATGAATTAT TTTTAAGTGT ATCAAATTCA AACGCATCTG CAACTTTTTC 4 50 0 

TAGTAAATTG TAACGGTAAT CATCAATAAA GTTACGTACG CCAATTTCAG TTACCATATA 4 56 0 

CCAACCGTTA AAGGGTGCAG TTGGATATAC AATGCCACCG ATTTTTAAGT CCATATTGGA 4 62 0 

25 

AATGATAGGG ACTGCATACC ATTTTAAGTT CAATTTTCTT AATTTTGGAT AATGATTATG 4 6 80 

TTCAATAGGT ACTTCTTTAA TTAATGAAGT AGGATATTCG TAAAATTTAA CTGACTCATT 4 74 0 

3Q AGGTAATTGG TAAATCAGTG GTAACACGTC AAAATTAGTA CCTTTTCCTT TCCAACCTAA 4 800 

GTGATTTGCT AAGCGTGTAA CTTCTTTTTC AGCAGGATCA CCACAATTGT CATAGCCAGC 4 860 

ATAGCGAATT AATTGATTGT TGAAAATTTT AGGTCCATCC TTTGGAGCAT ATATAGTAAT 4 92 0 

35 ATACGGCTTT AATTTACCTT CATTTGTAGC CTGTGTAATA TGATAAGTAA TTGATGATAA 4 980 

GAACGATGCT TCGTCAGTAA CATCTCTTGC ATCAATGACA TTTAACGAAT CCCAAAATAA 504 0 

ACGACCAATG CAACGATTTG AATTACGCCA AGCCATTTTA GCACCATAAA TAAGTTCTTC 510 0 

40 ' 

TTCTGTATGT GTATATGTCC CAGTTTCTTT TATTTCTAGT TCAATGTCAT GTAAACGTTT 5160 

ATTGATAATT TGCGTTTCAT AATGACACTC TTTATACATG TTTTCTATGA AAGCTTGAGC 52 20 

CTCTTTAAAT AACATTAACA ACACCTCGCT TTATATTATA GTCTACATTA TTAAAATACT 52 80 

45 

CTTAAAAATT ATGTATATGT CATTAAATTG TTGGTTGATT TTAATTAAAA GTATGGAAAT 53 4 0 

TAAGGGGCTC TTATGTATAT AAAAAAATGA ATTATGATAA AATGTAAGAA AATATTTAGG 54 00 

50 TCGATTGGAG AGATACAAGT GTAC CAATTA GAAGACGACA GTTTAATGTT ACATAATGAC 54 60 

TTATATCAAA TAAATATGGC TGAAAGTTAT TGGAATGATA ATATTCATGA AAAAATGGCT 5520 

GTATTTGATT TGTATTTTAG AAAAATGCCA TTTAATAGTG GCTATGCTGT TTTTAATGGT 55 80 

55 
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TTAAAGTCTA TTGGCTACAA GGATGATTTC TTATCATATT TAAAAGATTT AAAATTCACA 5700 

GGCAG CAT CC GTTCGATGCA AGAAGGCGAA TTATGCTTTG GTAACGAACC ATTGTTACGC 5760 

5 

GTAGAAGCAC CATTGATTCA AGCGCAATTA ATAGAAACAA TTTTATTAAA CATTGTAAAT 582 0 

TTCCATACAT TAATTACAAC AAAGGCTAGC AGAATTCGTC AAATTGCATC AAATGATAAA 5880 

TTAATGGAGT TTGGTACACG TCGTGCGCAA GAAATTGATG CAGCATTGTG GGGCGCTAGA 594 0 

w 

GCTGCTTACA TCGGGGGCTT TGATTCTACA AGTAATGTTA GGGCGGGGAA ATTATTTGGT 6000 

ATACCTGTGT CTGGTACACA TGCACATGCA TTTGTCCAAA CTTATGGAGA CGAATATGTT 6060 

15 GCCTTCAAAA AATATGCTGA AAGACATAAA AATTGTGTGT TCCTAGTAGA TACATTCCAT 6120 

ACTTTAAAAT CTGGCGTGCC AAATGCAATA AAAGTTGCAA AAGAATTAGG TGACAAAATT 6180 

AACTTTGTAG GTATTCGATT AGATTCTGGA GATATCGCTT ATTTATCTAA AGAGGCAAGA 624 0 

20 

CGTATGCTTG ATGAAGCAGG ATTTACTCAA ACTAAAATTA TCGCGTCTAA TGATTTGGAT 63 00 

GAAGAAACGA TTACGAGTTT GAAAGCACAA GGTGCAAAAG TAGATTCTTG GGGCGTTGGT 63 60 

ACAAAGCTGA TTACAGGATA CGATCAACCA GCATTAGGTG CAGTATATAA ACTTGTAGCT 64 2 0 

25 

ATTGAAAATG AAGATGGTTC ATATAGTGAT CGTATTAAAT TATCAAATAA CGCTGAAAAG 64 8 0 

GTTACGACGC CAGGTAAGAA AAATGTATAT CGCATTATAA ACAAGAAAAC AGGTAAGGCA 654 0 

30 GAAGGCGATT ATATTACTTT GGAAAATGAA AATCCATACG ATGAACAACC TTTAAAATTA 66 00 

TTCCATCCAG TGCATACTTA TAAAATGAAA TTTATAAAAT CTTTCGAAGC CATTGATTTG 66 6 0 

CATCATAATA TTTATGAAAA TGGTAAATTA GTATATCAAA TGCCAACAGA AGATGAATCA 672 0 

35 CGTGAATATT TAGCACTAGG ATTACAATCT ATTTGGGATG AAAATAAGCG TTTCCTGAAT 678 0 

CCACAAGAAT ATCCAGTCGA TTTAAGCAAG GCATGTTGGG ATAATAAACA TAAACGTATT 684 0 

TTTGAAGTTG CGGAACACGT TAAGGAGATG GAAGAAGATA ATGAGTAAAT TACAAGACGT 6900 

40 

TATTGTACAA GAAATGAAAG TGAAAAAGCG TATCGATAGT GCTGAAGAAA TTATGGAATT 696 0 

AAAGCAATTT ATAAAAAATT ATGTACAATC ACATTCATTT ATAAAATCTT TAGTGTTAGG 702 0 

TATTTCAGGA GGACAGGATT CTACATTAGT TGGAAAACTA GTACAAATGT CTGTTAACGA 70 80 

45 

ATTACGTGAA GAAGGCATTG ATTGTACGTT TATTGCAGTT AAATTACCTT ATGGAGTTCA 714 0 

AAAAGATGCT GATGAAGTTG AGCAAGCTTT GCGATTCATT GAACCAGATG AAATAGTAAC 7200 

so AGTCAATATT AAGCCTGCAG TTGATCAAAG TGTGCAATCA TTAAAAGAAG CCGGTATTGT 72 6 0 

T^TTACAGAT T7TCAAAAAG GAAATGAAAA AGCGCGTGAA CGTATGAAAG TACAATTTTC 73 2 0 
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TAAACGACAA GGTCGTCAAT TATTAGCGTA TCTTGGTGCG CCAAAGGAAT TATATGAAAA 7500 

AACGCCAACT GCTGATTTAG AAGATGATAA ACCACAGCTT CCAGATGAAG ATGCATTAGG 7 56 0 

5 

TGTAACTTAT GAGGCGATTG ATAATTATTT AGAAGGTAAG CCAGTTACGC CAGAAGAACA 762 0 

AAAAGTAATT GAAAATCATT ATATACGAAA TGCACACAAA CGTGAACTTG CATATACAAG 76 8 0 

ATACACGTGG CCAAAATCCT AATTTAATTT TTTCTTCTAA CGTGTGACTT AAATTAAATA 7 74 0 

10 

TGAGTTAGAA TTAATAACAT TAAACCACAT TCAGCTAGAC TACTTCAGTG TATAAATTGA 7 8 00 

AAGTGTATGA ACTAAAGTAA GTATGTTCAT TTGAGAATAA ATTTTTATTT ATGACAAATT 786 0 

15 CGCTATTTAT TTATGAGAGT TTTCGTACTA TATTATATTA ATATGCATTC ATTAAGGTTA 792 0 

GGTTGAAGCA GTTTGGTATT TAAAGTGTAA TTGAAAGAGA GTGGGGCGCC TTATGTCATT 798 0 

CGTAACAGAA AATCCATGGT TAATGGTACT AACTATATTT ATCATTAACG TTTGTTATGT 8 04 0 

20 AACGTTTTTA ACGATGCGAA CAATTTTAAC GTTGAAAGGT TATCGTTATA TTGCTGCATC 8100 

AGTTAGTTTT TTAGAAGTAT TAGTTTATAT CGTTGGTTTA GGTTTGGTTA TGTCTAATTT 816 0 

AGACCATATT CAAAATATTA TTGCCTACGC ATTTGGTTTT TCAATAGGTA TCATTGTTGG 822 0 

25 

TATGAAAATA GAAGAAAAAC TGGCATTAGG TTATACAGTT GTAAATGTAA CTTCAGCAGA 82 80 

ATATGAGTTA GATTTACCGA ATGAACTTCG AAATTTAGGA TATGGCGTTA CGCACTATGC 834 0 

TGCGTTTGGT AGAGATGGTA GTCGTATGGT GATGCAAATT TTAACACCAA GAAAATATGA 84 0 0 

30 

ACGTAAATTG ATGGATACGA TAAAAAATTT AGATCCGAAA GCATTTATCA TTGCGTATGA 84 6 0 

ACCTCGAAAC ATACATGGTG GATTCTGGAC TAAAGGCATT CGTCGTAGAA AGCTTAAAGA 8 52 0 

35 TTATGAACCA GAAGAACTGG AAaGTGTAGT AGAaCATGAA aTTCmAAGTA AaTGAGAaTG 858 0 

AAmCAATtGC TGATTGTTTG TCACGAATGA AAtGCAAGGG TATATGCCGG TAAAACGTAT 864 0 

TGAfiAAACCC GTGTTTCAAG AGCAAAAAGA TGGCACGGTT GAAGTATCAC ATCAAGAAAT 870 0 

40 CGTTTTTGTA GGTAAGAAAA TCCAATAACA TAATCCAATT TAAATAAAGA CTATTTGAAG 876 0 

AGGAAAGGCT ATTCAAAGTT TGAGTAATTT TACTTTGAAT AGCCTATTTG TTTATACATG 8 82 0 

CAAGATGCTC GATCCATATT GTATGAGAAA CCCCCAGCAA GCTATATAAA GCATATGCTG 8880 

45 

GGGGTTCTTA ATATTTTAAA AATTATTGTT AGATTATATA TATCGTCGCT TTTTCTAAAA 8 940 

CAATCTCATC GCATGAAATT TTTTCTTCCT AGAGACCTTT AATAAGATTA ATAGTTTACT 90 00 

5Q TAATCATATC TAGATAGTCT TATGACTTAT GCTTAATGAA AGTCATTCTA GGAGAAGTTC 90 6 0 

CCAAAGCTTC TGTGTTCATA ATTGTTAGTA GTATTTTATT ATCATTTGGT ATAAATATTT 912 0 

CAATAACAAT TG AG CTATT A TTTTTATTAT ATAATGTGAG TTGTTTGTGT TCTGTATTTA 918 0 
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15 



CATTTAAATC TTGAGGATGC CATTCTCCCT CAATAATATT AAGATAATAC TTAGCCTCTG 9300 

AATTACATTT GAATTTATCA ATACTAAATA ATTCAATTTG TTCCATAATA TTATTTACCT 93 6 0 

TTCTAAAATA CAAATTTTAA TAACCATAAA TAGATGAATA CCATCGATAA TGGTCGCCAT 9420 

TGGATACTGG AATAACATTG TTTTTAGCAT CTTGAGTCAT AAAACCATTA TCCCATGGAT 94 80 

TCCATATAAT TATAACCTCT TGTCCATTAT CTAATTTAGC GTTCCCAACA ACTGCCATGG 954 0 

CATGCCCTGC GTGCATACCA TTTCTTGATT CTACTCTACT ACCTAAAACA GCAATTCCTT 9600 

TATTATTTTT AGTAAGATTG TCAACTTCAT TATATGTAGT CATTCTATTA AGAAGTTGTG 966 0 

GACTTCTTCC CTGAGTTTGT CCAAAATAAA TCATCTCTCT TGGCGTTAAA CCAGTAAATT 972 0 

GGAATCGTTG TCCTTGTAAG TTTGGGTGTA AAAATCTCAT CACAGCTTCT GCATGATATT 9780 

TGTTAGTATT ATAAGTCGCA TTTAGTAATT CAGACATCGT ATAGCCTGCA CACCAACCAT 984 0 

20 TGTTACCTTG AGTTTCTCTT ATCTTGAAAT TCTCAAGTTT ATTTATATAT TGsTCGTTGT 9900 

AAGTATAATT ATTACTTTTA AATTGACTAG TTGGCATAGT GACAGAAGCT TTTTGCTTTA 9960 

GTTGCGTTAC ATTATTGCCA GTAGGTATAC TCTCAGTCTT TnTnAACTnT nTATCTTCTA 10020 

GACGTGGTGT TTTTAGTACT AGTTTAGCTT TATGATTTTG AGTACCACAT AGTAACCTTT 10080 

TGAGTTGT 10088 
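The running counter printed at the end of each row gives the position of the last base in that row, so a transcription of the listing can be checked row by row against the declared LENGTH. The following is a minimal sketch of such a check, not part of the original listing; the function names `parse_row` and `check_rows` are hypothetical, and the sketch assumes that any token made up only of the letters A, C, G, T or N (either case) is sequence and that trailing digit groups form the counter.

```python
import re

# Assumption: only A/C/G/T/N (either case) count as bases; other tokens
# (margin line numbers, scanning debris) are ignored.
BASE_RUN = re.compile(r"[ACGTNacgtn]+")


def parse_row(line):
    """Return (bases, counter) for one listing row; counter is None if absent."""
    tokens = line.split()
    bases = [t.upper() for t in tokens if BASE_RUN.fullmatch(t)]
    # Trailing digit groups form the running counter, e.g. "10 500" -> 10500.
    trailing = []
    for t in reversed(tokens):
        if t.isdigit():
            trailing.append(t)
        else:
            break
    counter = int("".join(reversed(trailing))) if trailing and bases else None
    return "".join(bases), counter


def check_rows(lines, start=0):
    """Return (final_count, mismatches) where mismatches lists (printed, counted)."""
    total = start
    mismatches = []
    for line in lines:
        bases, counter = parse_row(line)
        total += len(bases)
        if counter is not None and counter != total:
            mismatches.append((counter, total))
            total = counter  # resynchronise on the printed counter
    return total, mismatches


if __name__ == "__main__":
    rows = [
        "GACGTGGTGT TTTTAGTACT AGTTTAGCTT TATGATTTTG AGTACCACAT AGTAACCTTT 10080",
        "",
        "TGAGTTGT 10088",
    ]
    print(check_rows(rows, start=10020))
```

Rows in which a block was lost or garbled in scanning show up as mismatches, and the check resumes from the printed counter so that one damaged row does not flag every row after it.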
(2) INFORMATION FOR SEQ ID NO: 33: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7563 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 
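
Because the molecule is characterised as double-stranded, the listing shows one strand only; the other strand is its reverse complement. The short sketch below, which is illustrative and not part of the original listing, derives that second strand. The helper name `reverse_complement` is hypothetical, and undetermined positions written as n are assumed to stay n.

```python
# Complement table for A/C/G/T plus the undetermined base N, preserving case.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the complementary strand read in its own 5' to 3' direction."""
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("CGGAAACGnA"))
```

Applying the function twice returns the original strand, which is a convenient sanity check on any transcription of the table.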



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

40 CGGAAACGnA CCCnATGCGT ATGCTTGACG TGCCAAAATT AAATACGAAG TTCATAGCTT 6 0 

TGAGGTACCA GAAGAACATT TATCTGGTCA AGAAGTCGCA GnACTCATAC AAGCAAATGT 12 0 

TAAAACAGTA TTTAAAACGC TTGTTCTAGA AAATACAAAA CATGAACATT TTGTATTTGT 18 0 

TATCCCAGTA AGTGAAACTT TAGATATGAA AAAGGCAGCT GCTTTGGTTG GAGAGAAGAA 24 0 

ATTGCAGCTT ATGCCTTTAG ATAATTTGAA AAATGTAACG GGATACATTC GTGGTGGGTG 3 00 

TTCGCCTGTT GGTATGAAAA CATTGTTTCC AACAGTCGTT GACAAATCGT GTGAAAATTA 36 0 

TAGTCATATC AGTGTGAGTG GTGGGCTTCG AACAATGCAA AT CACAATAG CTGTTGAGGA 42 0 
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TGCCACACTC CTTTTTGATT GAATTAGCAT TTTACGATCA TAAACAGTCA TTATAATTGA 600 
GTATTTGAAC ATAAAAATGT AATTTTATCG TAACAATTTG AGTGTTTGTG ATTGTTTTTG 660 
GTAATTTATG ATTGAAAAGT GAAAGCGTAC TCATTATAAT ACAAAGTGAG ATGGGGTGAT 720 
GATGATAATT ACTGaAAAAA GACACGAGTT AATATTAGAA GAACTTTCGC ACAAAGATTT 78 0 

TTTGACTTTA CAAGAATTAA TAGATCGAAC TGGTTGCAGT GCTTCAACAA TACGArGAGA 84 0 

TTTATCTAAA CTACAACAAT TAGGGAAATT GCAACGTGTG CATGGTGGTG CAATGTTAAA 90 0 

AGAAAATCGT ATGGTTGAGG CGAATTTAAC TGAAAAATTA GCAACGAATC TTGATGAAAA 96 0 

GAAAATGATT GCTAAAATAG CAGCTAATCA AATCAACGAT AATGAATGCT TATTTATCGA 102 0 

TGCTGGTTCA TCTACATTGG AGCTAATTAA ATATATTCAA GCGAAAGATA TCATTGTGGT 1080 

AACCAATGGT TTAACACATG TAGAAGCTTT ACTTAAAAAA GGTATTAAAA CAATTATGCT 114 0 

20 AGGTGGTCAA GTTAAAGAAA ATACACTTGC TACGATTGGT TCTAGTGCTA TGGAGATATT 1200 

AAGACGATAT TGTTTCGATA AAGCTTTTAT CGGGATGAAT GGATTAGATA TTGAACTTGG 126 0 

ATTAACTACT CCCGATGAGC AAGAGGCATT AGTTAAACAA ACAGCAATGT CATTAGCCAA 132 0 

TCAATCATTT GTACTTATAG AT CATTCTAA GTTTAATAAA GTATATTTTG CTCGTGTACC 13 8 0 

TTTGCTAGAA AGTACGACAA TCATCACATC TGAAAAAGCA TTAAATCAAG AATCGTTAAA 14 4 0 

AG AAT AC CAA CAAAAGTATC ACTTTATAGG AGGGACTTTA TGATTTATAC AGTGACTTTC 150 0 

AATCCTTCAA TTGACTATGT CATTTTTACG AATGATTTTA AAATTGATGG TTTGAACAGA 1560 

GCAACAGCAA CATATAAATT CGCTGGGGGG AAAGGTATTA ATGTCTCGCG CGTCTTAAAG 162 0 

ACATTGGATG TTGAGTCAAC TGCCTTGGGA TTTGCAGGTG GATTTCCTGG GAAATTCATT 16 8 0 

ATAGATACAT TAAATAACAG TGCAATTCAA TCGAATTTTA TTGAAGTTGA TGAAGATACA 174 0 

CGTAJTAATG TGAAATTAAA AACAGGACAA GAAACAGAAA TCAATGCACC GGGTCCTCAT 18 00 

40 ATAACGTCAA CACAATTTGA ACAACTGTTA CAACAAATTA AAAATACAAC AAGCGAAGAT 186 0 

ATAGTTATTG TTGCTGGAAG TGTACCAAGT AGTATTCCAA GCGATGCGTA TGCGCAAATT 192 0 

GCACAAATTA CAGCACAGAC AGGTGCTAAA TTAGTAGTCG ACGCTGAAAA AGAATTGGCT 198 0 

GAAAgCGTTT TACCATATCA TCCACTATTT ATTAAACCTA ATAAAGATGA ATTAGAAGTG 204 0 

ATGTTTAATA CAACAGTGAA CTCAGACACA GATGTTATTA AATATGGTCG TTTGTTAGTT 210 0 

GATAAAGGTG CGCAATCTGT TATTGTCTCG CTTGGCGGTG ATGGTGCTAT TTATATTGAT 216 0 

AAAGAAATCA GTATTAAAGC AG IT AAT CCA CAAGGGAAAG TGGTTAATAC AGTTGGCTCT 222 0 

GGTGATAGTA CAGTTGCAGG CATGGTGGCT GGAATTGCTT CAGGTTTAAC GATTGAAAAA 22 8 0 
CGGGACGCTA TAGAAAAAAT AAAATCACAA GTTACGATTA GCGTACTTGA TGGGGAGTGA 2400

AAATAATGAG AGTAACAGAG TTATTAACAA AAGATACAAT AGCAATGGAT TTAATGGCAA 2460

ATGACAAAAA TGGTGTTATT GATGAGTTAG TAAATCAATT AGACAAAGCA GGTAAATTAA 2520

GTGATGTCGC GTCATTTAAG GAAGCGATTC ACAATCGAGA ATCACAAAGT ACAACTGGTA 2580

TCGGCGAAGG TATTGCCATT CCACATGCCA AAGTGGCCGC AGTTAAGTCA CCAGCTATTG 2640

CGTTTGGTAA ATCTAAAGCA GGCGTAGATT ATCAAAGTTT GGATATGCAA CCAGCACACT 2700

TATTCTTTAT GATTGcAGcG CCAGAAGGTG GCGCCCAAAC ACATCTAGAT GCTTTAGCTA 2760

AGTTGTCTGG TATTTTAATG GATGAAAATG TACGTGAGAA ATTATTACAT GCTTCATCAC 2820

CTGAAGAAGT ACTAGCGATC ATAGATGAGG CTGATGATGA AGTGACAAAA GAAGAAGAGG 2880

CAGAAGCTGA AGCACAACAA GTTGCAACTG CAGAACAATC ATCTAAACAA TCTAATGAGC 2940

CATATGTGTT AGCAGTAACT GCTTGTCCAA CAGGTATTGC ACACACATAT ATGGCACGTG 3000

ATGCATTGAA AAAGCAAGCG GATAAAATGG GTATTAAAAT TAAAGTAGAA ACGAATGGTT 3060

CAAGCGGCAT TAAAAACCAT TTAACTGAAC AAGATATTGA AAATGCAACA GGTATCATTG 3120

TTGCTGCTGA TGTTCATGTT GAGACGGATC GCTTCGATGG TAAAAATGTC GTAGAAGTAC 3180

CAGTAGCAGA TGGTATTAAA CGCCCAGAAG AATTAATTAA TAAAGCATTA GATACAAGTC 3240

GTAAACCTTT TGTTGCCCGT GATGGTCAAA GAAAAGGTAA CTCAAATGAC AGTCAAGAAA 3300

AATTAAGCCC AGGTAAAGCA TTCTATAAAC ACTTAATGAA CGGTGTTTCT AACATGTTGC 3360

CACTTGTAAT ATCTGGTGGT ATTTTAATGG CAATTGTATT TTTATTTGGA GCAAATTCAT 3420

TTAATCCAAA AAGCTCAGAG TACAATGCGT TTGCAGAGCA GCTTTGGAAC ATTGGTAGTA 3480

AAAGTGCATT CGCGTTAATC ATTCCAATTT TATCTGGATT CATTGCACGT AGTATTGCGG 3540

ATAAACCTGG TTTCGCTTCA GGTCTTGTAG GTGGTATGTT AGCAATTTCA GGTGGTTCAG 3600

GATTTATTGG TGGTATTATT GCAGGTTTCT TAGCAGGTTA CTTAACACAA GGTGTTAAAG 3660

CCATGACACG TAAGTTACCA CAAGCATTAG AGGGATTAAA GCCAACATTA ATTTATCCAC 3720

TATTAACAGT GACGGCTACA GGCTTATTGA TGATTTATGC CTTTAATCCA CCAGCATCTT 3780

GGTTAAATCA TTTGTTATTA GATGGATTAA ACAATTTATC AGGTTCTAAT ATTGTATTAT 3840

TAGGTTTAGT TATTGGCGCT ATGATGGCGA TTGATATGGG CGGTCCATTC AACAAAGCGG 3900

CATATGTTTT TGCAACAGGT GCGTTGATTG AAGGTAATGC AGCACCAATT ACAGCTGCAA 3960

TGATTGGTGG TATGATTCCA CCGTTAGCAA TTGCGACAGC GATGTTAATT TTTAGACGTA 4020
TGATTGGTTC AGGTATAGGT GGCGCAATTG CTTTAGGCTT AGGTTCACGA ATTACTGCGC 4200

CACATGGTGG TATTATTGTA ATTGTTGGTA CTGATGGTGC ACACTTACTT CAAACTCTTA 4260

TTGCACTTCT AGTTGGCACA TTAGTTTCAG CATTAATTTA CGGTTTAATC AAACCAAAGT 4320

TAACTGAAAC AGAAATCGAA GCTTCAAAAT CAATGGACGA GTAGTTTTAA TGATGTAAAA 4380

TGATTGTTAG CAAAGAGCTT CATATTAAGT TGTATGTTCA ATGAATATAT GTTAGTTTTA 4440

TATATCGTGT TAACGGTAGC TTATACAAAG CTGTAAAAAC ACTTTCTATT AATTCAGTTT 4500

TTATGAATTG ATATGAAAGT GTTTTTATTT TTAGATAAAT GAATGAAGAA ATAGACACCA 4560

CAAATGTATA GACTTTTTTA ATATTTTGCA AAAAGTTATG CCAAACGAAG CAGATATAGT 4620

AAAATATGAG TGTCTTAAAG TGAAAATTTA TAAATAAAGA AGGGTTTATA CGTGTCAGAA 4680

TTAATTATAT ATAACGGCAA AGTTTATACT GAAGATGGCA AAATCGATAA TGGTTACATT 4740

CATGTGAAAG ATGGACAGAT TGTTGCAATT GGAGAAGTGG ATGATAAAGC AGCAATTGAT 4800

AATGATACGA CAAATAAAAT TCAAGTGATT GATGCTAAAG GTCATCATGT ATTACCAGGT 4860

TTTATTGATA TACATATTCA TGGTGGTTAT GGTCAAGATG CAATGGATGG GTCATACGAT 4920

GGCTTAAAAT ATCTATCCGA AAATTTGTTG TCTGAAGGGA CGACATCATA CTTGGCCACT 4980

ACAATGACGC AATCGACTGA TAAAATAGAT AATGCACTTA CAAATATTGC TAAATATGAA 5040

GCGGAgCAAG ATGTTCACAA TGCAGCGGAA ATTGTAGGTA TACATTTAGA AGGACCATTT 5100

ATATCTGAAA ATAAAGTTGG TGCTCAACAT CCGCAATACG TTGTACGCCC ATTTATCGAT 5160

AAAATTAAAC ATTTTCAAGA GACTGCTAAC GGATTAATAA AGATTATGAC GTTTGCACCT 5220

GAAATTGAAG GTGCAAAAGA AGCGCTTGAA ACGTATAAAG ATGACATTAT TTTTTCAATT 5280

GGTCATACAG TAGCAACATA CGAAGAAGCA GTTGAAGCTG TTGAGCGAGG AGCTAAACAT 5340

GTCACGCATT TATATAATGC AGCGACGCCA TTCCAACATA GAGAACCAGG TGTTTTTGGA 5400

GCAGCATGGT TGAATGATGC TCTACATACC GAAATGATTG TTGATGGCAC TCATTCTCAT 5460

CCGGCATCGG TTGCAATTGC TTACCGTATG AAAGGTAATG AACGTTTTTA TTTAATTACC 5520

GATGCAATGC GTGCAAAAGG TATGCCTGAA GGAGAATATG ATTTGGGTGG ACAAAAAGTA 5580

ACTGTTCAAT CGCAACAAGC ACGTCTTGCA AATGGTGCGC TTGCTGGTAG TATTTTAAAA 5640

ATGAATCATG GGTTACGTAA CTTAATATCA TTTACAGGTG ATACATTAGA TCATTTATGG 5700

CGAGTAACAA GTTTAAATCA AGCCATTGCA TTAGGTATCG ATGATAGAAA AGGTAGTATT 5760

AAAGTAAATA AGGATGCAGA TCTTGTTATT CTAGATGATG ATATGAATGT AAAATCTACA 5820

ATAAAACAAG GCAAGGTTCA CACATTTAGC TAATAAATAA TCATAATTAA ATGTATGCAA 5880
TTTTCTGGGG GTGTCTAAAT GGGAAGGCGA TAACATGTAG TTGTAATTTA AGTCATAGTG 6000

ATAAATTTGA ATGCGTGTTA CCCATGAGTG ACACATATAA CATGGAGGTG AATCCCTAGA 6060

AATAGGGAAT TAATTGGAAA CTTCGACCAT AATTAGTTTG ATTATATTTA TTCTATTAAT 6120

TGCATTAACC ACTGTATTTG TTGGTTCAGA ATTTGCATTA GTAAAAATTA GAGCAACAAG 6180

AATTGAACAG CTAGCAGATG AAGGAAATAA ACCTGCTAAA ATAGTAAAAA AGATGATTGC 6240

TAATCTAGAT TATTATCTTT CTGCTTGTCA GTTAGGTATA ACAGTAACAT CTTTAGGGTT 6300

AGGTTGGCTT GGTGAACCAA CGTTTGAAAA GCTATTACAC CCAATATTTG AAGCAATCAA 6360

TTTACCAACT GCATTAACGA CGACGATTTC GTTTGCAGTG TCATTTATAA TCGTTACGTA 6420

TTTGCATGTA GTACTTGGTG AATTAGCGCC TAAATCTATA GCTATTCAAC ATACTGAAAA 6480

GCTTGCTTTA GTATATGCAA GACCATTGTT CTATTTCGGT AACATTATGA AACCATTGAT 6540

TTGGCTGATG AATGGTTCTG CACGTGTTAT TATTAGAATG TTTGGTGTAA ATCCTGATGC 6600

CCAAACTGAT GCAATGTCAG AAGAAGAAAT CAAAATTATT ATTAACAATA GTTATAATGG 6660

TGGAGAAATC AACCAAACTG AATTGGCATA TATGCAAAAT ATCTTTTCAT TCGATGAAAG 6720

ACATGCAAAA GATATAATGG TACCTAGAAC TCAAATGATT ACACTAAATG AACCTTTTAA 6780

TGTAGACGAA TTACTAGAAA CAATAAAAGA ACATCAATTT ACGCGTTATC CAATTACTGA 6840

TGATGGTGAT AAAGACCACA TTAAAGGATT TATTAACGTC AAAGAATTTT TAACTGAATA 6900

CGCTTCTGGA AAAACGATTA AAATAGCAAA CTATATaCAT GAGTTGCCAA TGATTTCAGA 6960

GACAACACGT ATCAGTGATG CATTAATTAG AATGCAACGT GAACATGTAC ATATGAGTCT 7020

TATTATAGAT GAATATGGTG GAACGGCAGG TATTTTAACG ATGGAAGATA TTTTAGAAGA 7080

AATCGTTGGA GAAATTCGTG ATGAATTTGA TGATGATGAA GTGAATGATA TCGTTAAAAT 7140

TGAT^ATAAG ACATTCCAAG TAAATGGCAG AGTACTATTG GATGATTTAA CTGAAGAGTT 7200

CGGTATAGAA TTTGATGACT CTGAGGATAT TGATACGATA GGTGGATGGT TACAATCTCG 7260

TAATACCAAT TTACAAAAAG ATGATTACGT GGATACAACT TATGATCGCT GGGTTGTTTC 7320

AGAAATCGAT AACCACCAAA TTATTTGGGT GATATTAAAC TATGAATTTA ATGAAGCGAG 7380

ACCTACTATC GGACAGTCTG ATGAAGATGA AAAATCAGAA TAGATATTAA TATATAAACC 7440

AACTAAGAAT GATTTAATTC ATTTTTGGTT GGTTATTTTT TTGACTAAAA TTAAnGAAAA 7500

GTGAAAATAG TATTGGAACT CAATATCTTT AATGATTTAA TGAATAAnTT TTATTGAAAG 7560

CGA 7563
(A) LENGTH: 3492 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

TTATATCAAC TTCATGGCGG AACCATTGAT GACCCATTAG ACGAAACAAT AAGCGCATTT 60

sATGAATTGA AACAAGAAGG AATTATACGT GCTTACGGTA TTTCTTCTAT TCGCCCAAAT 120 

GTAATTGATT ATTATTTAAA ACATAGTCAA ATCGAAACGA TAATGTCTCA ATTCAATTTG 180 

ATTGATAATC GTCCAGAATC ATTATTAGAT GCAATTCACA ACAATGATGT TAAAGTATTG 240

GCAAGAGGAC CTGTGTCTAA AGGATTATTA ACTTCAAACA GTGTTAATGT GCTCGACAAT 3 00 

AAATTTAAAG ATGGTATTTT TGATTATTCT CATGATGAAT TGGGTGAAAC AATAGCCTCT 360 

20 

ATTAAAGAAA TTGAAAGTAA TTTATCTGCA TTGACATTTA GTTATTTAAC ATCACATGAC 420 

GTGCTTGGTT CCATCATTGT AGGTGCAAGT AGCGTCGACC AATTAAAAGA AAATATTGAA 4 80 

AACTATCATA CTAAAGTTAG TTTAGATCAG ATTAAAACAG CAAGAGCTCG TGTAAAGGAT 54 0 

25 

TTGGAATATA CCAATCATTT AGTGTAGAAG TCATTTTCAG TAATAAAAAC AGCAGCATGA 600

GGCGTTTCAT TATAAAAATG CCTTACTGCT GTTGTTTATG TACAATTCGC TATAATTTAT 66 0 

30 GATTATGATT ACTCACTTAT GATAGAAATT AAAGCGTTGT CCTCACGCAT CAGTATTTAG 72 0 

TAATTTCGCC TTGCGGCATT GCCTTAAGCA AACTTCTGCC ACTTCATCTC TTAATAATTT 78 0 

TATTAAAACA TCTTTCTATA TTTCACTTCG CATGTTGATT CATCATTATT AGTTATTATT 84 0 

35 TGTACACCCA GCACATTTCC TTGCAACACA AGTAGTTTGA ATTTTTCACA AGTATAATAT 900 

AATGTACCGT CTGAAATTTG GTCTACAGAA ATATCGCCTA AAATATCCAG CACTGTAAAT 960 

TCTTCAAATA CTGATAGTTG TTCCGCATAT CGTACACAAA GTCTTACCAC ACTCTCCGAT 102 0 

40 

TGACAGTTCA TTGCCATCCC ACCTATTTAT GCTTTATTTT TAAATAATTT AGGGAAACAT 108 0 

CGTTCAAAAA ATCTAGGCGC AATTTGATAC ATTTTCAACG CATGaTGCAT CCATTTAGGC 1140

CGATTAATTT CCAATTGTTT TGTTTTAATG CCATAAATGA TATCTTCTGC AAGCTGATTA 12 0 0 

45 

GCATCAAGCA TAATTTCCCC CATCTTTTTA gCATACTTCA TTGATGGGTC GGCTTTTTGA 126 0 

TGAAAAGGTG TATCAATCGG GCCAACATTA ACTGTCATGA TATGTAAGTT TGGTGACTCT 13 2 0 

50 AGTCTTAAAG CATTCATTAA TGCATAAAAC CCTGCTTTCG ATGCCCCATA ATGTGCAGCA 13 8 0 

TTTGCTTGTG TGGAAAATGC AGCTTGACTT GAAATACCTA CAATATGTGC GTTAGATGTT 14 4 0 

AAATATGGTC TCAACACAGT ATATAAAACA TTAAAACTAA TTAAATTAAG CTGATACGTT 15 0 0 
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TAAATGAATC CATCGAATGA TGTATTGTCT TCAAATTGCA GTGCCTGTAT CGACTTCAAA 1620 

TCATTTAAGT CACAAGGAAT AACATTTATA GTTTTCCCCA ATTCCTGTTC AAAGATTCTA 16 80 

5 

GTTGCTTTAT CAACATCACG CACCAACAAC GTTACATGCA CTTTATTTTC TAGTAACTTT 174 0 

CGGACAATCG ATAAACCTAA ACCACTCGTA CCACCAGTCA CTATAAAATG TTGTCCTTTC 18 00 

ATCAATTAAC CTTCCTTTTC AATTATATAG AATGCAATTT ATCAACTTTA CATAATTGAG 1860 

w 

ACAAGTTGAT TATCTTTCCT AATATATATA CAATAATAAG AAAATATAAC ATACAAATCA 1920 

AAAACTAAAG GGATGTGaCG TTAATGrAAC TCGTATTTTA TGGAGCTGGT AATATGGCAC 198 0 

AAGCTATATT TACAGGrATT ATTAACTCmA GCAACTTAGA TGCCAATGAT ATATATTTAA 2040

CAAATAAATC TAATGAACAA GCTTTAAAAG CATTCGCTGA AAAACTAGGT GTTAACTATA 2100 

GTTATGAtGA TGCGACATTA TTAAAAGATG CAGAyTATGT ATTTTTAGGT ACCAAACCAC 2160 

20 ATGACTTTGA TGCTCTAGCA ACACGCATCA AACCACATAT TACAAAAGwC AATTGCTTCA 2220 

TTTCAATTAT GGCAGGTATT CCGATTGATT ATATTAAACA ACAATTAGAA TGCCAAAATC 2280 

CaGTTGCTAG AATTATGCCA AACACAAATG CGCAAGTTGG ACACTCTGTT ACTGGCATTA 234 0 

25 GTTTTTCAAA CAACTTTGAC CCTAAATCTA AAGATGAAAT TAACGATTTA GTTAAAGCAT 2400 

TTGGTTCTGT AATTGAAGTA TCAGAAGATC ATTTACATCA AGTAACAGCT ATCACCGGAA 2460 

GCGGCCCAGC ATTTTTATAT CATGTATTCG AGCAATATGT TAAAGCTGGT aCsAAACTTG 2520 

30 

GTCTAGAAAA AGAACAAGTT GAAGAATCTA TACGCAACCT TATTATAGGT ACAAGTAAGA 25 80 

TGATTGAACG TTCAGAtTTG AGCATGGCTC AATTAAGAAA AAATATTACC TCTAAAGGTG 264 0 

GTACGACACA AGCTGGCCTT GATACATTGT CACAATATGA TTTAGTATCT ATTTTCGAAG 2700 

35 

ATTGTCTAAA CGCTGCCGTC GACCGTAGTA TTGAACTTTC TAATATAGAA GACCAATAAA 276 0 

AACASACCCG CCAACACATG TATGCATCAT CGCAAGCACT GTGTTTGACG GGTTATTTTT 2 820 

ATAATTTATT GTTATTTGGC AAGCATTGTT TATTACTTTG TCATTAGATT TTAAAACTAT 2880

CAAAATCTTT TACAAAATTA AAATTAGGTG TATCTTCATT TTGTATCAAT GTTTGATAAA 2 94 0 

TTTCATTTAT ATCTTCTGTA TTATAGCGAT TGCTCAAATG TGTAATCAAC GTACGTTTAA 300 0 

CATTGGCTTC TTTTATCAAT GCAAATACGT CTTCAATATG GCTATGATGA TAATTGTTGG 3060

CTAAATGCTT TTCACCATCT ATATAGGTCG CTTCATGTAC CATCACATCA GCATCTCTAG 312 0 

AAATCACACG TTCATTAGAA CATGGTTTTG TATCACCAAA AATTGCTACA ACTGGACCCT 3180 

50 

G7TTGGACTC ACCTCTAAAA TCTTTTGATT GATAAACTTG ACCATTATGT TCAAATGTAT 3 24 0 
CATGATTAAG TAAATGCGCC TCTACAGTAA AACCATCCAT GATGATATGT CAGATGATCA 3420

TCGATTTCAA TATATGtAAT TGGATAGTTT AAATGTGACT CTGATAAATT CATAGACATT 3480

TCCACATATG CT 3492

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1973 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

ATCTAGCGGT ACAAGCGTCT TGGAGGCTAG TATGTTGAAC ATTGTAAACC CTGAAGATCA 60

CTTCGTTGTC ATTGTTTCAG GTGCCTTTGG TAACCGATTT AAACAAATTG CACAAACTTA 120

TTACAAAAAT GTGCATATTT ATGACGTAAC ATGGGGAGAA GCTGTAGATG TCAAAGATTT 180

CATCAATTTC CTTTCAACTT TAAATGTTGA AGTTAAAGCA GTATTTAGTC AATATTGCGA 240

AACATCTACG ACAGTGCTAC ACCCTATTCA CGAGTTAGGA AATGCCATTA ATCAATTTAA 300

TAGTAATATT TATTTTGTAG TTGACGGCGT AAGTtGCATT GGTGCTGTTG ATGTTGACAT 360

TAACAAAGAT AAAATTGATG TACTTGTTTC TGGTAGTCAA AAAGCAATTA TGTTACCTCC 420

AGGATTAGCT TTTGTAGCTT ATAGCCACCG TGCAAAAGAA CATTTCAAAG AAGTAACTAC 480

GCCAAAATTT TATCTAGACT TAAATAAATA CATTTCGTCA CAAGCTGACA ATTCTACACC 540

GTTCACACCA AATGTGTCTT TATTTAGAGG TGTAAATGCA TACGTTGAAA CCGTAAAAGC 600

AGAAGGTTTC AATCACGTAA TAGCACGACA CTATGCAATT AGAAATGCAT TAAGAAGCGC 660

CTTAAAAGCA TTAGATTTAA CTTTATTAGT CAATGATAAA GATGCATCTC CAACGGTTAC 720

AGCATTCAAA CCTAATACAA ATGATGAAGT GAAAATAATC mAAGATGAAC TTAAAAATnG 780

CTTTAAAATA ACAATTGCnG GTGGTCAAGG CCATCTTAAA GGTCAAATTT TnAGAATTGG 840

TCATATGGGG AAAATTAGTC CTTTCGATAT TTTATCGGTA GTATCTGCTT TAGAAATTAT 900

TTTAACTGAA CACCGTAAAG TTAACTATAT CGGTAAAGGT ATATCAAAAT ATATGGAGGT 960

TATTCATGAA GCAATTTAAT GTACTCGTTG CAGATCCCAT ATCAAAAGAT GGTATCAAAG 1020

CATTATTAGA TCACGAACAA TTCAATGTAG ATATTCAAAC TGGCTTGTCC GAAGAAGCAT 1080

TAATCAAAAT TATACCTTCA TACCATGCTT TAATCGTTCG TAGTCAAACT ACGGTTACTG 1140

AAAATATCAT AAATGCTGCT GATTCTTTAA AAGTAATCGC ACGCGCCGGT GTTGGTGTAG 1200
GTAATACGAT TTCAGCTACT GAACATACAC TGGCAATGTT ATTATCAATG GCACGAAATA 1320

TTCCGCAAGC ACACCAATCA CTTACAAATA AAGAATGGAA TCGAAATGCA TTTAAAGGTA 1380

CTGAGCTTTA TCATAAAACA TTAGGTGTCA TTGGTGCTGG TAGAATTGGT TTAGGTGTTG 1440

CTAAACGTGC GCAAAGTTTC GGAATGAAAA TACTAGCTTT TGACCCTTAC TTAACGGATG 1500

AAAAAGCAAA ATCTTTAAGC ATTACGAAGG CAACAGTTGA TGAGATTGCC CAACATTCTG 1560

ATTTCGTTAC ATTACATACA CCACTAACAC CTAAAACAAA AGGCTTAATT AATGCTGTCT 1620

TTTTTGCCAA AGCAAAACCT AGTTTGCAAA TAATCAATGT GGCACGTGGT GGTATTATTG 1680

ATGAAAAGGC GCTAATAAAA GCATTAGACG AAGGACAAAT TAGTCGGGCA GCTATCGATG 1740

TGTTTGAACA TGAACCTGCA ACTGACTCGC CTCTTGTTGC ACATGATAAA ATTATTGTTA 1800

CACCTCATTT GGGTGCTTCA ACAGTCGAAG CTCAAGAAAA AGTGGCAATT TCTGTTTCAA 1860

ATGAAATCAT CGAAATTTTA ATTGATGGTA CTGTAACGCA TGCAgTGAAT GCACCTAAAA 1920

TGGACTTAAG CAATATAGAT GATACTGTAA AATCATTCAT CAATTTAAGC CAA 1973

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7620 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

GGTGTTTCAG ATGTCACTGG TTGATTTTTA ATTGTAGACG GGTATTTTGG GCTTTCGCCA 60

TATTTATTTG CCGGCTTACT GTCAAAGCAT AGGAATACTA TCATAACAAT TGTTAGGCCT 120

AAAT^AACAA AATAAAGAAG TACTAACAAA ATATTAAGAC CCATCGGCAT TAATGTAAAA 180

TCACTGTCAT AATAACTATC GATAATCTGT AATACTATAT AAAATATAAT ACTGAATACT 240

GTCATAATCA TTGGAAATAA CATTGTTCTT GATATATCGT GAAATCTTCG AACGCACAAC 300

GCTAAATTTG GAATAAACGT TGCCAAACTA TAGACAAAAG TATACACAGA TGTAAGGATA 360

ATCATCAATA TACTCATAAC TATTAATGTT TCGTTATCCG CCGCTATAGA AATAAAGAAT 420

AGAAATAGGT TTATTATTAG CACACACACA GCTGGAACCA TAAGTATCAA ATGCCATAGT 480

GCCATATACC AATATTCACT ACGTCTTGAT CTCCCCTTAA AATTTACATA ATTTTTCCAA 540

AATAAAACGA ATGATTTCAT AAAACCTACT TGAGGTAATT GTTCCATTGT AATCTCCCTT 600
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GTAAAATGAA AACCCGCTAC AAGTACACAT CTATATGGAG ACTCATTTGA AAGTCAACGC 7 80 

TTCGTTAACT ATACTAAAAA TATGTCATAC TGCAATGTTC ACGTTTAAAA GAGTCTCAAT 840 

5 

CTATGCAAAT AAAATATTCC ATAACAAAGT ATATACTTTA CATTTTTATA ATTCTTAACA 900 

ATACTATTTT ATCAAACATT TACCACAATA AAAATATCTT TTTCATTTTT ATTTAAATTA 960 

ATCATATAAT TGCGAGGAGA ATATTATGGA TTTCGTTAAT AATGATACAA GACAAATTGC 102 0 

10 

TAAAAACTTA TTAGGTGTCA AAGTGATTTA TCAGGATACC ACTCAAACGT ATACAGGCTA 10 80 

CATCGTGGAA ACGGAAGCTT ACTTAGGTTT GAATGATCGT GCGGCTCATG GCTATGGCGG 114 0 

15 TAAAATAACA CCTAAAGTCA CGTCATTATA TAAACGTGGT GGTACAATTT ATGCACATGT 1200 

CATGCATACG CATTTACTCA TTAATTTTGT AACAAAATCT GAAGGTATAC CTGAAGGCGT 12 6 0 

ACTTATCCGC GCAATTGAAC CAGAAGAAGG TTTATCCGCT ATGTTCCGTA ACAGAGGTAA 13 2 0 

20 GAAAGGCTAC GAGGTAACGA ATGGCCCAGG AAAATGGACT AAGGCATTTA ACATTCCACG 13 80 

GGCTATCGAT GGCGCTACGT TAAATGACTG TAGATTGTCT ATTGATACTA AGAATCGTAA 144 0 

ATATCCTAAA GATATTATTG CTAGTCCACG AATCGGTATT CCAAATAAAG GTGATTGGAC 1500 

25 ACATAAATCT TTACGTTACA CAGTGAAAGG TAATCCATTT GTGTCTCGCA TGCGTAAATC 156 0 

AGATTGTATG TTTCCCGAAG ATACTTGGAA ATAAATGCCA TCTTTCATTG ATTACTATCA 162 0 

TGAAAATGAA ATCTATCTCC TTATAAGTCA ATCAATCGTG CCGTCAACAT GCGGATGGGT 1680 

30 

TGATTGTTTT TCTTTGTATC CATCATATTT TTTGATTCAT CTCCTCTTAT TGAACTTGTT 174 0 

CTTAATTATA AAATATAACA ATAGAATTAT TTATAATTAT TAAATTTAGA TGCATTAATA 18 0 0 

TTATTGATAT TATTTTCAAA AACTAGAAAT ATTGATTTGT TGCATGTATA ATGTTAAAAG 18 60 

35 

CGCCCTTTTA TAACGCTTAC ATATAAAAGC TTATTTAGGG AGAGGGATAT TCAACAAGGG 192 0 

GGATTTGAAA ATGATAGAAC TTAATGCAAT TACAACATTA TGTTTAGCTT GTATCCTTTA 198 0 

40 TTTACTTGGT AAGGCTATCG TTAATCACGT TAATTTTTTA AAACGTATTT GTATACCAGC 204 0 

ACCAGTGATT GGCGGCTTAA TCTTTGCTAT TTTAGTTGCG GCTTTGGATT CATTTGGCAT 2100 

GGTTAAGATT AAATTAGATG CTTCATTCAT TCAAGATTTC TTCATGTTAG CATTCTTTAC 2160

45 GACAATCGGT CTTGGTGCAT CATTGAAATT ATTTAAATTA GGTGGCAAAG TCTTGCTATT 222 0 

ATACTTTATG TTTTGTGCTA TCATTTCAGT CATTCAAAAC ATAGTTGGTG TATCACTAGC 22 80 

AAAAGTATTA AATATTAAAC CTTTGTTAGG ATTAACAGCA GGTTCCATGT CTATGGAAGG 234 0 

50 

CGGTCATGGT AATGCTGCTG CTTATGGTAA GACAATTCAA GATTTAGGTA TTGATTCGGC 240 0 

ACTGACAGCG GCTCTTGCAG CTGCAACTTT AGGTCTTGTA TTTGGAGGGC TTATCGGTGG 246 0 
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ATTTAAAGAT TATAGCCAAG TAGCATATAA CGAACATTTA CATAGTAAAT TTAATGCCAC 2580 

TGAAGTATTC TTCATTCAAT TTACAATCGT TGTATTCTGT ATGGCAGTTG GAAGTTATTT 264 0 

CAGTCATTTG TTTACAGCTC AAACAGGGAT TAATGTTCCA ATTTACGTTG GCTCATTATT 2 70 0 

TGTAGCTGTT ATTGTCCGAA ATATCTCTGA AAGTTTTAAT TTTAATATTG TAGATTTAAA 276 0 

AATTACTAAT CAAATTGGCG ATGTCGCATT AGGTATTTTC TTATCTCTTG CGCTAATGAG 2 82 0 

10 

CATTCAATTA ATCGAAATTT ATAAACTTGC TATACCTCTT ATTATTATCG TTTTAGTTCA 2880

AGTTGTCGTT ATGATTTTAT TTGCTGTTTT AATTTTATTT AGAGGTTTAG GAAAAGATTA 2 94 0 

TGATGCTGCA GTAATGGTAG GTGGTTTTAT CGGTCATGGG CTTGGTGCAc GCCAAATGCC 3000

ATGGCAAATT TAGATGTTAT TACTAAAAAA TATGGAAACT CACCTAAAGC ATATTTAGTT 3 06 0 

GTACCTATTG TTGGTGCATT CTTAATCGAT TTAATTGGTG TTATAGTCAT TATGGGATTC 312 0 

20 ATACAATGGT TTAGTTAAAC ACCAAACTCA TAAATAAAAG AGGAGGCCTT CGCCTCcTcT 318 0 

TTTATTTATC CTCGATGTAT ATTCAAGTTA CGTTGTTCTA TCCATGACAA TATTTCCGGA 3 24 0 

CTAAATACGA TTTGTTTTTG TGTTAAGTCG TCAATATTTT TAGCATCTAA CATCGTCATT 3 3 00 

25 ATTGATTTCA TGTGTTCAAT AAATGATTCT ACATAAGCTA CTGTATGTGC AATGCCATTA 3360 

TTTTCAACTT GATTTAAAAA CGGACGTGAC ATACCAGTTG CCTTTGCACC AAGTGCTAAA 3420 

CTTTTAATTG CATCGAGTGG TGTACGTAAA CCACCACTCG CGAAAACTGA AATTTCGCTT 34 80 

30 

TGATAAGCCG TTGTTTCAAG TAATGACTCA ACTGTAGACT GTCCCCATGA TGATAAGTAA 3 54 0 

TCCATATCTT TATTTGCACG ACGTTCATTT TCAATATCTA CAAAGTTAGT ACCACCTTTG 3600 

35 CCACTAACAT CGACATACTT GACGCCTATT TGTTGTAAGT CATGCATTAA TTCTTTGCTC 366 0 

ATACCAAATC CAACTTCTTT TATAATGACT GGAACAGACA CTCGTGATAC AATCGACGCT 372 0 

ATATTATCTA ACCAAGTCAC AAATTCACGA TTCCCTTCAG GCATAACTAA TTCTTGAGGA 378 0 

40 GAATTAACAT GGATTTGTAA CGCTTGTGCC TCAAGTAATT CAACTGCTTC CAAAGCCTTT 3 84 0 

TCTACTGGTA CGTCCGCACC AACATTGCTA AAAATCATGC CTTCAGGATT CATTTTTCGC 3 900 

GCAATCGTAA ACGTCTCAGC CATGCGTGGA TTTCTCAATG CCGCATGTGT TGATCCAACT 3 96 0 

GCCATCGCTA AGCCAGTTTC TCTTGCAACT ACAGCTAGCT TTTCATTGAT GTTTTTCGTC 4 02 0 

CACTCGCTAC CACCCGTCAT TGCATTAATA TAAACCGGAT ATGCCATCGT TAAGTCAGGC 4 080 

GTCTGTGATG TCAAATCGAT ATCATTTACA TTAATTGATG GGATAGAATG ATGCACAAAA 414 0 

CGCATCTTAT CAAAATCTGA ATGCATTGCG TCAGATTGGG CCATTGCTAT TTCAACATGT 4 200 
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ATTACAGCTA AGCAAATATA ATATCCATAA TGTAAATGTA ATGCCGGCAT ATTTACAAAG 43 8 0 

TTCATACCAT AAATCCCAGC TATGAATGTT AACGGTGAAA ATATAACTGA TACTAATGTC 444 0 

5 

AGTACTTGCA TAATACTATT CATTCTAAAT GACGTGTATG ACTCAAAATT TTCTCGTATT 450 0 

TCGTTTGTCA TTTCTTGAGC AGTACGAATG ATATTACGTT GCTTAATCAA GTGGTCATCG 4 56 0 

ATATGTTGAA TGTATAGCGA ATGTTTATTA TCTATAATCA AATCACCATT TTGTTTCATT 4620

GTATCAATTA GCTCTTGCAT AGGAAACAGT ACACGTTTTA CTTTAATCAA ATCCGAACGT 46 8 0 

AACTTAAAGA CACTATCCAT GACCATTTTA TTAAAGCGAT CATCTACATG GCGGTCTTCA 4 74 0 

15 AAATGATAAA CACTATCTTC AAGTGCATAT ACAAAGTTGA AATATTTATC AACCATCATA 480 0 

TCTAAAATTA ATATGACGAC ATCTGCACAA TCTAATTCTG CATCTAATGT ATTCATATAC 4860

TTATAGACTA CTTTATTTAA TGATTCCAAC GTTTGATGAT GATATGTTAC TAATACATTG 4 920 

20 

TCTTGTATAA AAATATTTAG TGCTATTGGT GAATAGTTTG ACCCCATAAT ACTATGGAAT 4 98 0 

ACTAAGTATT GATAATCTTT ATAAGATTTA TATTTAGCTC GTGGCATACC GTTAATTGCA 504 0 

TCATCCACTT CTAAATCATT AAAATTAAAA TGTGCTTTAA ACCATTCATT TTCTTGTTCA 5100 

25 

TTCGGTTCAT CAAAATCATA CCAAACAATA GTCGCATCTT TTGGTATCTC TTTGATATCA 516 0 

TCAACTACTT TAAACGGTTC ATATGTAGTT TGATACCGTA TCTTTAAAGC CATCGATACT 522 0 

CCCCCTAAAT AACGAATTCT CTATTATTTT ATCATGAATT AAATAACGTG TATGTCTTAA 5280 

30 

TTTATTTTAG TATGATAGTC ACTAAGGAGA TGGTTATTAT CAAACAACTT TTTACACATA 534 0 

CTCAAACCGT AACATCTGAA TTCATTGACC ATAACAATCA TATGCATGAT GCAAATTATA 54 0 0 

35 ATATCATTTT TAGTGACGTC GTGAATCGTT TTAATTACAG CCACGGTCTT TCTTTAAAAG 54 6 0 

AACGCGAAAA TTTAGCATAT ACGCTATTTA CACTAGAAGA ACATACGACA TACCTCTCAG 552 0 

AATTGTCTCT TGGCGATGTA TTTACTGTTA CTTTATATAT TTATGATTAC GATTATAAGC 55 8 0 

GGTTGCATTT ATTTTTAACA TTAACTAAAG AAGATGGTAC ACTAGCATCA ACAAATGAAG 5640

TAATGATGAT GGGAATTAAT CAGCACACAC GTCGTTCTGA TGCTTTTCCT GAATCATTTT 5 70 0 

CAACACAAAT AGCACACTAT TATAAAAATC AATCAACTAT CACTTGGCCT GAACAATTAG 5 760 

45 

GACATAAAAT AGCAATTCCA CACAAAGGAG CATTAAAATG ACAGATGCAT TACAACAAAA 5820 

GATTCATATC GAATTACTAG AITTATTAGA TGATGTTAAG TTTGAATTAA CAGAATTAAA 588 0 

TGCACAAAAA GGGTTATACA TTAACGGACC AGCAAATCAG CTACTTAAGC GTGGCGTGCA 594 0 

50 

TATGGCTTAT GTTCAAGGAC AAAAGCAAGC CATCGATAAT ATTATGACTA TTGTGGAACA 6000 

ACAGCTTGAA AGATCAACAT TTCCTAGAAC ATTATGATAA ATTTCAAAAT GAGGTTGCTC 606 0 
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ATAATTTTTT AGATCAATTT TATCAAATTA AAGGGCAATA CTTTATCATC ACACATATCA 6180 

ATACACTTAT TGGTGATTTT CACTCAGAAG CTCATTAACA ATTAGTCTAT ATAACCCTTG 624 0 

5 

CTATATTTTC AAAAACAAAA CCCAATTACG TTTTCATGTC AAATATCATC TTGCATGAAA 6 3 00 

TCGTAACTGG GTCATTTATA TGTTATTAGT TATTTTGTGT TACATCCTCA TCTATCGATT 63 6 0 

TGGCAATTTG TTTAATAGCT TTATGTGATT GTCTAATTGG ATAAATTGGA AAATCATGTA 642 0 

10 

CCATCTTAGG ATAATCATAA AACTCAATGT ATTGATGATG TTGCAACATC ATTTGTTCAA 64 8 0 

ATAGCTTCAT ATCAGGATGT GTCATTTCAC GTCCACCACC AAACATATAA ACTGGTGGCA 654 0 

15 ATCCTTCTAT TGTGCCATTA ATTGGCGATA TGCGCTTATC TGTTAATGGT AGGCCATTCG 6600 

CCCATTTTTT CATAATCTCA TTGACACCAA ACTGACTTAG aACCGCATCT TGTTCGATTA 6660 

AGGCGTCCGA AATATCTTTA TTAGATAGTG TTGCATCTAA AATTGGTGAG ATTAAATACA 672 0 

20 ATTTATTCGG TAATGGCTGT TGATTAkCTA AAAGAGATTG TACAAAGGAT AATGCCAGTG 6780 

CACCACCTGA ACCATCACCC ATGACTACGA CATTTTGATG TCCTACTTCA GATACTAATT 6840 

GaTCATAAAC ACGTTGTATC GCTTGGnAAA GTATCGTCaA TATGnAAACT CTGGTGTCTT 6 900 

25 TGGATAGATA GGCAGTACAA CCTCATATAA TGtACTTAAA GTGATTTTAT CCCAACAATC 6 96 0 

TCCAATGGAA CGGTGATGGT TGTAGTGCAT TGAATCCACC GTGAATATAT AAAATTTTCT 702 0 

TATCAATTTG ATGTCTGAAA TTAAAGCGAA AGACTTGCAT ATCATCTAAT GACAATTTTT 70 8 0 

30 

CTAAATTTGC TTTAACATTT AATGTTGAAG GCTGCTTATG TTTTTTTCTA TTTTCAATTT 714 0 

CTCTTTTATA AAAAAATCTT TCAACATCTT GATCATTTTT AAACATAATC GAGCGATTGT 72 0 0 

GAAGCAAATA TTTATTGACA ACGCTATTCA TAACACGGTT TCTAATCAAT GTCTTAACCT 726 0 

35 

ACCTTTATAT ATTTTATGTA TCCAATGATk GTCTATCCCC TACATTCTTT GCCAAAAAAA 73 20 

GTAXATAATG TAGAAGATAT TTTCTTTTTC ACTTTCAAAT TTAAGACTAC AATTGAACAG 73 8 0 

40 TGATTTTTCA TCATTATAAC AGACAACTAG ACATATTGAT AAGTAAAGAA AAGAACTTTA 74 4 0 

TACGGAGGTA CCTTGCATGA CAAATCCAAA TCAACGATTA GAACCATTTG ATGAGACATT 75 00 

TCAACAACCG AATATTCATC GTGGTAAGCG ATATGGTAAG AAAAAACGTT CATTGGTAAG 7560

45 CATGATTATT CAAATCATTG TTGTwATATT AACCACCATC GCTGGAATAC AGCATGGTGG 762 0 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9834 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

GTCATtACCG amTTTCtTAG AaTCATTTAA AGATGATAAA TATACAAACG TTGGTAATTT 6 0 

AAAAGAAGTG AATTTTGATA AAATTGCTGC GACGAAACCC GAAGTAATCT TTATCTCTGG 120 

ACGTACAGCT AATCAAAAGA ATTTAGATGA ATTCAAAAAA GCTGCACCTA AAGCGAAAAT 180 

TGTTTATGTT GGTGCAGATG AAAAGAACTT AATTGGTTCA ATGAAACAAA ACACTGAAAA 24 0 

TATCGGAAAA ATTTACGATA AAGAAGATAA AGCTAAAGAA TTAAATAAAG ATTTAGATAA 3 00 

CAAAATTGCT TCAATGAAAG ATAAAACGAA AAACTTCAAT AAAACTGTTA TGTATTTACT 36 0 

AGTTAACGAA GGTGAATTAT CAACATTTGG ACCTAAAGGT CGTTTTGGTG GATTAGTTTA 4 20 

CGATACATTA GGATTCAATG CAGTTGATAA AAAAGTAAGT AATAGCAATC ATGGACAAAA 480

TGTTTCTAAC GAATATGTTA ATAAAGAAAA TCCAGATGTT ATTTTAGCGA TGGATAGAGG 54 0 

TCAAGCGATA AGTGGTAAAT CAACTGCGAA ACAAGCATTA AATAATCCTG TATTAAAAAA 60 0 

TGTTAAAGCA ATTAAAGAAG ACAAAGTATA TAATTTAGAT CCTAAATTAT GGTACTTTGC 660 

AGCTGGATCA ACTACAACTA CAATTAAACA AATTGAGGAA CTTGATAAAG TTGTAAAATA 720 

25 ATTTTAAAAG AGGGGAACAA TGGTTAAAGG TCTTAATCAT TGCTCCCCTC TTTTCTTTAA 780 

AAAAGGAAAT CTGGGACGTC AATCAATGTC CTAGACTCTA AAATGTTCTG TTGTCAGTCG 84 0 

TTGGTTGAAT GAACATGTAC TTGTAACAAG TTCATTTCAA TACTAGTGGG CTCCAAACAT 9 00 

AGAGAAATTT GATTTTCAAT TTCTACTGAC AATGCAAGTT GGCGGGGCCC AAACATAGAG 96 0 

AATTTCAAAA AGGAATTCTA CAGAAGTGGT GCTTTATCAT GTCTGACCCA CTCCCTATAA 102 0 

TGTTTTGACT ATGTTGTTTA AATTTCAAAA TAAATATGAT AGTGATATTT ACAGCGATTG 108 0 

TTAAACCGAG ATTGGCAATT TGGACAACGC TCTACCATCA TATATTCATT GATTGTTAAT 114 0 

TCGTSTTTGC ATACACCGCA TAAGATTGCT TTTTCGTTAA ATGAAGGCTC AGACCAACGC 12 00 

TTAATGGCGT GCTTTTCAAA CTCATTATGG CACTTATAGC ATGGATAGTA TTTATTACAA 12 60 

CATTTAAATT TAATAGCAAT AATATCTTCT TCGGTAAAAT AATGGCGACA scgTGTTTCA 13 20 

GTATCGATTA ATGAACCATA AACTTTAGGC ATAGACAAAG CTCCTTAACT TACGATTCCT 13 80 

TTGGATGTTC ACCAATAATG CGAACTTCAC GATTTAATTC AATGCCAAAT TTTTCTTTGA 1440

CGGTCTTTTG TACATAATGA ATAAGGTTTT CATAATCTGT AGCAGTTCCA TTGTCTACAT 1500 

TTACCATAAA ACCAGCGTGT TTGGTTGAAA CTTCAACGCC GCCAATACGG TGACCTTGCA 1560

AATTAGAATC TTGTATCAAT TTACCTGCAA AATGACCAGG CGGTCTTTGG AATACACTAC 162 0 

CACATGAAGG ATACTCTAAA GGTTGTTTAG ATTCTCTACG TTCTGTTAAA TCATCCATTT 16 8 0 
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AGTGTTCTTT TTGAATAATG CTATTACGAT AATCTAACTC TAATTCTTTT GTTGTAAGTT 18 0 0 
TAATTAACGA GCCTTGTTCG TTTACGCAAA GCGCATAGTC TATACAATCT TTAACTTCGC 1860
CACCATAAGC GCCAGCATTC ATATACACTG CACCACCAAT TGAACCTGGA ATACCACATG 192 0 
CAAATTCAAG GCCAGTAAGT GCGTAATCAC GAGCAACACG TGAGACATCA ATAATTGCAG 1980 
CGCCGCTACC GGCTATTATC GCATCATCAG ATACTTCGAT ATGATCTAGT GATAATAAAC 204 0 
TAATTACAAT ACCGCGAATA CCACCTTCAC GGATAATAAT ATTTGAGCCA TTTCCTAAAT 2100 
ATGTAACAGG AATCTCATTT TGaTAGGCAT ATTTAACAAC TGCTTGTACT TCTTCATTTT 2160 
TAGTAGGGGT AATGTAAAAG TCGGCATTAC CACCTGTTTT AGTATAAGTG TATCGTTTTA 222 0 
AAGGTTCATC AACTTTAATT TTTTCATTTG GGATAAGTTG TTGTAAAGCT TGATAGATGT 22 80 
CTTTATTTAT CACTTCTCAG TACATCCTTT CTCATGTCTT TAATATCATA TAGTATTATA 234 0 
CCAATTTTAA AATTCATTTG CGAAAATTGA AAAGAAAGTA TTAGAATTAG TATAATTATA 24 0 0 
AAATACGGCA TTATTGTCGT TATAAGTATT TTTTACATAG TTTTTCAAAG TATTGTTGCT 24 6 0 
TTTGCATCTC ATATTGTCTA ATTGTTAAGC TATGTTGCAA TATTTGGTGT TTTTTTGTAT 252 0 
TGAATTGCAA AGCAATATCA TCATTAGTTG ATAAGAGGTA ATCAAGTGCA AGATAAGATT 2580
CAAATGTTTG GGTATTCATT TGAATGATAT GTAGACGCAC CTGTTGTTTT AGTTCATGAA 264 0 
AATTGTTAAA CTTCGCCATC ATAACTTTCT TAGTATATTT ATGATGCAAA CGATAAAACC 27 0 0 
CTACATAATT TAAGCGTTTT TCATCTAAGG ATGTAATATC ATGCAAATTT TCTACACCTA 2760 
CTAAAATATC TAAAATTGGC TCTGTTGAAT ATTTAAAATG aTGctACCGC CAATATGTTT 282 0 
TGTATATTTT ACTGGGCTGT CTAAGAGGTT GAATAATAAT GATTCAATTT CAGTGTATTG 2 880 
TGATTGAAAA CAATTAGTTA AATCACTATT AATGAATGGT TGAACATTTG AATACATGAT 2 94 0 
AAACTcCTTT GATATTGAAA ATTAATTTAA TCACGATAAA GTCTGGAATA CTATAACATA 3 000 
ATTCATTTTC ATAATAAACA TGTTTTTGTA TAATGAATCT GTTAAGGAGT GCAATCATGA 3060 
AAAAAATTGT TATTATCGCT GTTTTAGCGA TTTTATTTGT AGTAATAAGT GCTTGTGGTA 312 0 
ATAAAGAAAA AGAGGCACAA CATCAATTTA CTAAGCAATT TAAAGATGTT GAGCAAAAAC 3180 
AAAAAGAATT ACAACATGTC ATGGATAATA TACATTTGAA AGAAATTGAT CATCTAAGTA 3 24 0 
AAACTGATAC AACTGATAAA AATAGTAAAG AATTTAAGGC ACTACAAGAA GATGTTAAAA 3 3 00 
ACCATCTCAT ACCTAAATTT GAAGCATATT ATAAGTCAGC AAAAAATTTG CCTGATGATA 3 3 60 
CAATGAAAGT TAAGAAATTA AAAAAAGAAT ATATGACGCT TGCAAATGAG AAGAAGGATG 34 20 
. m T-*ij\AAAAA TTCATAGGTT TATGTAATCA ATCTATCAAG TATAACGAAG 34 8 0 
AATTAGCTGA TAATAAAAGT GAAGCAACTA ATCTTACGAC AAAATTAGAA CATAATAATA 3600
AAGCGTTAAG AGATACTGCG AAGAAGAACC TAGATGATAG TAAAGAAAAT GAAGTAAAAG 3660
GCGCGATTAA AAATCACATT ATGCCAATGA TTGAAAAGCA AATTACCGAT ATTAACCAAA 3720
CTAATATTAG TGATAAGCAT GTTAATAATG CAAGGAAAAA CGCAATAGAA ATGTATTACA 3780
GTCTGCAGAA CTATTATAAT ACACGTATTG AAACAATAAA GGTTAGTGAG AAGTTATCAm 3840
AAGTCGATGT AGATAAGTTG CCGAAAAAGG GTATAGATAT AACTCACGGC GATAAAGCCT 3900
TTGAAAAAAA GCTTGAAAAA TTAGAAGAAA AATAACTATA ATCATTTTTC AAAGTTAAAA 3960
ATTTTGAATT TATGGTTAAC ATGTCAACTT ACTATGTGTA TAATGGTAAA CATTGATATT 4020
AACTATATGT ATAAAAATGT CACGCAGATG CTATTTAAAT GTGATAAATA TTTTTAGAGG 4080
TGAATAGAGT GGCTATAAAG CTAAGTTCAA TTGACCAATT TGAACAGGTT ATTGAGGAAA 4140
ATAAATATGT TTTTGTATTA AAACATAGTG AAACTTGTCC AATATCGGCA AATGCGTACG 4200
ATCAATTTAA TAAATTTTTA TATGAACGCG ATATGGACGG TTATTATTTG ATTGTCCAAC 4260
AAGAACGCGA TTTGTCAGAT TATATTGCTA AAAAAACGAA CGTTAAACAT GAATCACCTC 4320
AAGCATTTTA TTTTGTAAAT GGTGAAATGG TTTGGAATCG AGACCACGGT GATATCAATG 4380
TGTCGTCATT AGCACAAGCA GAAGAATAAT GAAACTATAG GGTTGGAACA TTTTGCCTTA 4440
CACTACTAGA CGTGAATAGC ACAACTTAAA TTCGTGTGAA TCAGAGTAGT TTGGCTATAA 4500
TGATGTTCTG ACCTTTTATT TTATGTCACC TTTAGAAGCA GTTAAGTTAG TACTTTTTTA 4560
CAAACATATG TATAATATAT TCGAGTATTT TTATTGAAAa tATTTTGGAA AACGACGAAT 4620
CCAATAAGAA AATTTAAACA TGATTTGTAA GTTAGTTTAA TAGGAAATAT ATGCTAAACC 4680
AAAAGAAGCA TATTGTTATT TACTGGAATA ATTAATAATC ATGTCATGTT AAATGTTAGC 4740
ATATAATCAC GAGATAAAAT CTAAAATTTA AGATTAATCT TTTATGAATA AAAAACGTAT 4800
CACAACAAAT AATAAAGTAA GGTGGTCAAG GTTATGAAAG TATTAGTAGC CATGGATGAG 4860
TTTCATGGAA TTATTTCAAG TTATCAAGCT AATAGATATG TTGAAGAGGC AGTTGCAAGC 4920
CAAATTGAAA CTGCAGATGT AGTTCAAGTA CCATTGTTTA ATGGAAGACA TGAATTATTA 4980
GATTCTGTAT TTTTATGGcm ATCTGGGcaA AAGTATCGTA TACCAGTACA TGATGCAGAT 5040
ATGAATGAAG TTGAAGGTGT TTACGGACAA ACTGATACAG GGATGACCGT TATCGAGGGG 5100
AATTTATTTT TAAAAGGTAA AAAACCAATT GTTGAACGAA CAAGTTATGG TTTAGGAGAA 5160
ATGATTAAAC ATGCATTAGA TAACGACGCA AAACATGTTG TAATTTCACT AGGTGGGATT 5220
GATAGTTTTG ATGCTGGTGC AGGTATGTTA CAAGCATTAG GTGCTCAATT CTATGATGAC 5280
GATATGTCGA ACTTACACCC TAAAATGGAA ACAGCAAGAA TTCAAGTAAT GTCGGATTTT 5400

TCAAGTCGAT TATATGGTAA GCAAAGTGAA ATCATGCAAA CTTATGATGC GCATCAGTTG 5460

AATCATAATC AAGCAGCAGA AATCGATAAT TTAATTTGGT ATTTTAGTGA GTTATTTAAA 5520

AGTGAATTGA AAATTGCAAT TGGTCCAGTT GAACGTGGTG GTGCTGGTGG TGGAATTGCA 5580

GCAGTCTTGA ATGGACTGTA TCAAGCTGAA ATATTAACCA GTCATGCATT AGTAGACCAA 5640

CTAACACATT TAGAAAATTT AGTTGAACAA GCGGATTTAA TTATTTTTGG AGAAGGATTA 5700

AATGAAAATG ATCAGTTGCT AGAAACGACA ACATTGCGTA TTGCAGAACT TTGTCATAAA 5760

CATCAAAAGG TTGCCATTGC AATTTGTGCA ACTGCTGAAA AGTTTGATTT ATTTGAATCA 5820

CAAGGGGTTA CAGCAATGTT TAATACATTT ATCGATATGC CAGAAACTTA TACTGACTTT 5880

AAAATGGGtT ACAAATTAGG CATTATACGG TTCAGTCTTT AAAACTGTTG AAAACACATT 5940

TTAATGTTGA GGTTTAGTAA AGAAGGACTA AATTGGTGAT GCTGTCATGA TGGTTAATAA 6000

CATTTATGAT GGTTAGCAAA ACGAATTAGA AGATCGAAAG TATACGTAAA AAATATGAAA 6060

AATCACGCTA TCATTGCACT GAATGTTAGC GTGATTTTTA TATATTAATT AAGCCTGAGT 6120

TGAACTAGTA TATAATCGTT GGTTTTTAGT GATTTTCAGC GATATCTTCT ACAATTCCAA 6180

TGATTACTTG TACTGCTTTT TCCaTAACAT CAATGGATGC aTATTCATAT GGGCCGTGGA 6240

AGTTACCGCA ACCTGTAAAG ATGTTTGGAG TTGGTAACCC CATAAATGAC AATTGTGAAC 6300

CATCTGTACC ACCGCGAATA GGTTCAGTGT TTGCTGGAAT ATCTAATTTG GCAAAGACAC 6360

GTTTAGGTAT ATCAATAATA TGAGGCAATG GTAATATTTT TTCTGCCATA TTGAAATATT 6420

GATCCGATAT ATCAACTTTA ACTGGATAAT TTTCAAAATG GGCATTGATA TCGTCACGTA 6480

TTTCTAAAAT ACGTTTCTTA CGCAATTCGA ATTGTTTTTT ATCATGATCA CGAATAATGT 6540

ATTGCAAAGT TGCTTTTTCA ACAGTTCCTT CAAAGTTCAT TAAGTGATAA AAGCCTTCGT 6600

ATCCTTCTGT TCGCTCCGGA ACTTCACTAT CAGGTAGCAA ACTATCGAAT TGTTCACCTA 6660

AACGTATTGC GTTTACCATT GCATTTTTAG CTGAACCAGG ATGAACATTT ACACCGTGGC 6720

ATGTAATAAC CGCTTCAGCA GCGTTAAAGC TTTCATATTG TAATTCTCCA TATTGACTAC 6780

CATCCATAGT ATAAGCAAAA TCAGCATTGA AGCGGTCAAC ATCAAATTTA TGTGGACCAC 6840

GACCGATTTC TTCGTCTGGT GTAAATCCAA TGCGAATGGT ACCATGTTTA ATTTCTGGAT 6900

GTTCTTGTAA ATAACAAATA GCTTCCATAA TTTCCACAAT ACCCGCTTTA TCGTCTGCAC 6960

CTAGTAACGA TGTACCATCA GTTACCATTA ATGTATGACC AACTAAACTG TTAAGTTCTG 7020

CJAAATACTTT AGGATCTAAG ACACGTTTAG TATTGCCTAG TTTGTATGGC TTACCATCAT 7080



GCGCCAAAAA TCCAACTGTT GGGACGTCGA 
AGTAGCCATT TTCATCTAAA TCAGTTGGCA 
5 AATGTAACAA ATCCCATTGC TTTTCAGTTG 

GCGTATCAAT TGTCGTATAT CTTGTTAATC 
ACCCCTTAAA CTCTATTATT CATGTTGTAA 

10 

CCATACAGTT GTTTGATACG TGTGTATAGG 
AAAGCAATCG CACCTGAAAT CAGTGTAcTT 
CATTTGATAC TAAAAAACGA GTCGCTTGAT 

15 

TGCCTGGCAC TATGAATATA ATTACCGGTC 
TTAAGCCTAA AATTAAGCTT CCCAAAAATG 

20 CCGTTAATTG GTAAATCGTC CATGCAATGG 

GTTTGGGTGC ATTGAAAATG ATAGAGAAAA 
GAAATAAATA AAATAGCATG CTTTAACAGT 

25 GACACCAGCA CCGATTGCGA ATGCTGTTAA 

AAGTAATTCA CCCGCTAATA AATCTCGAAT 
TGGCATGACA CTGGCTATAG TAATGATATC 

30 TGTGGCTGCA ATGGATATGA CCACAGCGGC 

TATATAGCGT tGCACAAAGC TGAATGTTAA 
CCAACAATCT GATGCGACAC CACCAAACAT 

35 

TGCAAAGAAA TTCGTTAAAA AAGAATA7TG 
AGATTTAGCT TCATCAATTG TGAGTTCTTT 
TAAAGCGATT TTCTCTAAAT CTGTTGTACG 

40 

TCGATCGTTT AATGAAAAAA TAATTGCAGT 
ACCATAACTA TGTGCGATAC GGTTCATTGT 

45 TTCaAGTAAA ATTCTACCTG CAATTAATAC 

TGTGATTGAA TCTGGCATAT CAATTCACCT 
TGaAGTTTAC AACTTGTTGT TACAACTTTC 

50 CTTGTATGGT TCAAATTTAA ATAAGAAAAA 

AATAATAGCA AAGGATTAAC AGTTTTGTCG 
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CATCGATGTT ACTTTCTAAT GTAGCAAATA 72 0 0 

ATCCTAATTG TTGTAATTCT TTTTCTAATA 7260

AAGGTGTTGT TGTAGATTTT GGATCAGATT 7320

TATCTATCAA TTGGTTCTTC ATTATATTCG 7380

GATTTTTTAT ATGTCTTACC TTTGATTTTA 7440

TAATATAGAA TTTCAGAAAC TAATATACCG 7500

CTAAAAATGT ATTTACAGCA CTTGTATAAT 7560

AAGCTGCACC ACCAGGTACT AATGGTATAA 7620

GTTTATATCT GCGACTCATA GTATGACTCA 7680

AAGCGCCAAC TTTTCCAAAC TCTAAATCTA 7740

CACCCACAAA TCCACATGCT ACTAAGAGGC 7800

GTACTGTTGA TATAAAGCTG ATTGTAAAAT 7860

CCTTCCTTAA ATGATTAATA AAACGATTGC 7920

TGCAGCTTCA ACACCGCGAG ACATACCTGC 7980

GGCATTGGTA ATTAATATAC CAGGGACAAG 8040

TTGATTGGTT GCAATGCCTA ATTTAGTAAA 8100

TGCAACAAAC TCTGAGAAAA ATTTAATTTG 8160

AAATGCGGAT CCGCCAGCAA TGACTGCAAT 8220

AAATAGGAAG AAGCCACATG CAATGGCAGC 8280

TAATGATGCA TGCTGTAAAT GAATAAATTC 8340

ATTTGATATT TTACGTGAAA GACTATTCGT 8400

CTCTTGTACA CGAATTAATC TTGTACTTGT 8460

TGAACTGACA AAACTATATG TATTATGAAG 8520

ATCTTCAACT CGATATGTTT CAGCACCTGA 8580

AACATCAATC ACTTTGTTTT CATCTATAAT 8640

CCAATGATAT GTGTTATTTA TTTGAACAAT 8700

AATAGTGAGA CTTTGTGTTA GTATGATGAA 8760

CTGTTAATCT TTGCTATTAT ACTATGATTT 8820

TTGTTATAAA TTGATAATAG GGTTAAACAT 8880
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TTTACGCTGT GATTTTGGAT CGTCATCTGT TAAATAACCA 
TTTAATAACT TCTTTGTTTG GTAAATGGAA TGATGATTTT 
5 TTCAGCTAAT TTAACACTTT GATCAAGTGA ATAATTGTGA 

GCCACCATTT CTAAAAATTT TAAATTGATT CGGCACATAG 
TTGTTTTAAT ACAGCATCAC CTGATTTGTG TGAGTAGGTA 

w 

ATCGATATCG ATTAATAATA ATGCGATACT TTGATGTTCT 
TTCATTTAAA TGTCTATCAA ATTCTTTTAC ATTACCTAAG 
ATCTTCGTTT TCATAACGAT TTACGAGTGA GAAGAAATGC 

15 

CGCTGAAGCT AAAGTGATAA TTAATGAAAT TGGTATTAAA 

GTAAATAGGA CTCACTAACG CGACACCAAA TAAAATGATT 
2Q TAATAATGAT AGCACATCAT TTTGTTTTAA AAATGGTCCA 

AATAACAATC AACGTAACAC CGTACATAAT CGAGTTGTTA 

TGCTACAATT ACTGTGGCAG ATAATGTATA GACCATATTT 
25 TAAAGGAACG AATGTTAAGT GAATTAAATA ATCTTCACGA 

TAATAATAAT GATACGATTG TCATTAAAAC AGTGACATAA 

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23439 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
TCTCAATCAG ATGAAAAATT GCATATCGTA GGTTTTACAG AAAGTGCAAA ATATAATGCG 60

TCATCAGTCA TTTTCACGAA TGACGCTACC ATTGCCAAGA TCAATCCTAG ATTGACTGGA 120

GATAAAATTA ATGCAGTTGT TGTACGTGAT ACAAATTGGA AAGACAAAAA ATTAAACCAA 180

GAGCTTGAAG CGGTAAGTAT TAATGACTTT ATTGAAAATT TACCAGGTTA TAAACCACAG 240

AACTTAACAT TAAACTTTAT GATTTCATTC TTATTTGTCA TTTCAGCTAC AGTTATAGGC 300

ATTTTCCTAT ATGTCATGAC ATTACAAAAG ACGAGTTTAT TTGGCATATT AAAAGCTCAA 360

GGATTTACGA ATGGCTATTT GGCGAATGTG GTAATTTCGC AGACGGTCAT ATTAGCACTA 420

TTTGGTACGG CATTTGGCTT ACTGTTAACA GGCGTTACAG GTGCATTTTT ACCTGATGCA 480



ACACCGATAG ACACTGACAA 9 000 

TCAACACCCG AACGAATATT 9060 

ATGACAACTG AGAACTCTTC 9120

TTTTTAAGTA ATTGAGACAT 9180

TCATTGaCAT CTTTAAATCC 9240

TTTTCAGCTT TTCGTGAAAT 9300

CCTGTTAAGT AATCATATTT 9360

CAAATATCGA CAAATGTTAT 9420

ATGATAACTT CCGATAGTGT 9480

ATTGTAACAA CATTAAGTAT 9540

ATAGCACTTG TTACTGCAGC 9600 

AATACTACAA TTTCAACAAT 9660 

GTAAATCTAC CTAAAAACAA 9720 

TAAGGGATAG GGTAGACAGA 9780 

GCCTTAGAAA AAAC 9834 
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TCTGTATTAG GAAGTTTATT CTCCATTTTA ACAATTAGAA AAATAGATCC GTTAAAGGCG 6 00 

ATTGGGTAGG AGGTGTAGCA AATGTTGAAA TTTGAAAATG TAACAAAGTC ATTTAAAGAT 660

GGGAATCGTA ACATTGAAGC GGTTAAAGAT ACAAATTTTG AGATAAATAA AGGTGATATT 720

ATAGCATTGG TTGGACCTTC TGGCTCTGGT AAAAGTACAT TTCTAACTAT GGCAGGTGCT 780

TTACAAACAC CGACATCTGG GCACATTTTA ATCAATAACC AAGATATTAC GACAATGAAG 840

CAAAAAGCAT TGGCAAAAGT TAGAATGTCT GAAATAGGTT TTATTTTACA AGCTACAAAC 900

CTTGTACCAT TTTTAACGGT AAAGCAACAA TTTACATTAT TGAAAAAGAA AAATAAGAAT 960

GTTATGTCTA ATGAAGACTA TCAGCAACTT ATGTCACAAT TAGGTCTAAC TTCATTGCTT 1020

AATAAGTTAC CTTCAGAAAT TTCAGGTGGT CAGAAACAAC GTGTGGCGAT AgCaAAGCGT 1080

TATATACGAA TCCGTCGATT ATTTTAGCGG ATGAACCTAC CGCGGCGTTA GATACTGAAA 1140

ATGCGATTGA AGTCATTAAA ATTCTACGTG ATCAAGCCAA ACAAAGAAAG AAAGCATGTA 1200

TTATTGTTAC ACATGATGAA CGACTTAAAG CATATTGTGA TCGTTCATAT CATATGAAAG 1260

ATGGCGTCCT TAATCTTGAA AATGAAACAG TAGAATAGTT TTATTAAGCC GGTACATCAT 1320

GTGCCGGTAT TTTTATGTTT ATGTATTATT TGAATAAACT TTCACATTCA ATTAATAATA 1380

ATTATTATCG AAAATCAGAA ATATTCCGTG AAATATAATA TTTTTTGTAG TAAAATGGCC 1440

TCTAAGTATT CAATATTTAA ATATGGGGAT TGAATATAAA ATTATCGTAA TGGGGGTCAA 1500

TGGTTATGGA TTTATTGATA GGTACTTTAT TTTTATTTTT GGTCTTAGTG ATTTTTACAT 1560

TATTTACATA TAAAGCGCCT AATGGTATGC GTGCCATGGG AGCATTAGCT AATGCAGCAA 1620

TCGCAACATT TTTAGTGGAA GCATTTAATA AATATGTTGG TGGCGAAGTA TTCGGTATTA 1680

AATTTTTAGA AGAGCTAGGA GACGCTGCGG GAGGTCTAGG TGGTGTCGCT GCCGCTGGAT 1740

TAACAGCATT AGCTATCGGT GTGTCACCAG TATATGCATT AGTTATAGCA GCCGCGTGCG 1800

GTGGTATGGA TTTATTACCA GGTTTCTTTG CGGGTTATAT GATTGGATAT GTGATGAAAT 1860

ATACAGAGAA ATATGTGCCG GATGGTGTCG ACTTAATTGG ATCGATTGTC ATCTTAGCGC 1920

CATTAGCTCG TCTTATTGCA GTATTATTAA CGCCAGTAGT GAATAGTACA TTGATTCGAA 1980

TTGGTGATAT TATCCAAAGT AGTACGAATA CGAATCCAAT TATCATGGGT ATCATTTTAG 2040

GTGGTATTAT TACGGTTGTC GGCACAGCGC CATTGAGTTC AATGGCATTG ACAGCATTAT 2100

TAGGTTTAAC GGGTGTACCT ATGGCTATTG GTGCCATGGC AGCATTTAGT TCGGCATTTA 2160

TGAATGGGAC GCTATTCCAT CGCTTAAAAT TAGGTGATCG TAAGTCTACG ATTGCAGTAA 2220

GTATTGAACC TTTATCACAA GCAGATATTG TATCAGCCAA TCCAATTCCA ATCTATATTA 2280
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ATGCGACAGG TACAGCTACA CCGATTGCAG 
CGACGACAAT TGTGATTTAT GGTGTAGTAA 

5 TTGGTTCAAT TGTATTTAAA AAATATCCAA 

GTGCAGTAGA CG CAT AG CAT CAT CAT A IT G 
TGATTCAGTC GATGTAACAG TCGATAATGA 

10 ATAAAAATGT CATAATTTAT TGTCGACAAA 

TCAAGTACCT TTTACACGAT GGAATGAACT 
AACAAATATC ATTGATATAA CGGTAAATGT 

15 

ACGATATTTT TGTAAATTCA CTGATTCAAG 
TGAGCGATTC TGAGAAAGAA ATTTTAAAAA 
GTGAACTTGC TGAGGCAATT GGATTATCTA 

20 

TAATACAAAA GGAATATGTT ATGGGAAAGG 
TTTGTATTGG CGCAGCGAAT GTAGATCGTA 

os AAACATCAAA TCCTGTAACG TCAACACGCT 

AGAACTTAGG TAGGCTTGGC GAAACGGTCG 
AATGGGAAAT GATTAAACGA TTGTCCACAC 

30 TTGAAAATGC GAGTACAGGT TCATATACAG 

ATGGCTTaGC AGATATGGAA GTGTTTGACT 
CACACTTATT GAAAAAGGCT AAGTGCATTA 

35 TAAACTTCTT ATGTGCCTAT ACCACGAAAC 

CTTCCCCAAA AATGAAAAAT ATGCCTGATT 
ATAAAGATGA AACAGAAACA TACTTAAATT 

40 

TAGCTGCTAA ACGCTGGAAT GATTTAGGTG 

AAGAACTCAT TTATCGAAGT GGTGAGGAAG 

GTGTGAAAGA TGTTACAGGT GCAGGCGATT 

45 

TAAATGGGAT GTCTACTGAA GATATATTAA 

TAGAAACGAA ATATACAGTT AGGCAAAACC 

AGGATTATAA AAATGGCAAA TTTACAAAAG 

50 

GCACGGGAGA ACAATCAACC GATTGTAGCA 
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GATTTTTAGT TATGTTTGGA TTTAATCATC 24 0 0 

TGGCGATTGT AGGTGCGCTT GCAGGTTATC 24 6 0 

TTGTTACTAA GCAAGACATG ATTAATCGAG 252 0 

AATAGTAAAA ACAAATAAAA CATAGTAACG 25 8 0 

GTCACGTTTT TTTATAGAAA AATACAAGAC 264 0 

TATCATACTG TATAAACATT TATCATTTTC 27 0 0 

TACTTTTTAC GAAATTATGC GTATTTTATA 276 0 

AAGCGTTTAC AACAGAAATA ACAGCATGCT 2820 

TATTTTAAGT CAATATGAGG AGGGATGTTA 2880 

GAATTAAAGA TAATCCGTTT ATTTCACAAC 2 940 

GACCCAGCGT AGCAAACATT ATTTCAGGAT 3000 

CATATGTTTT AAATGAAGAT TATCCTATTG 3060 

AGTTTTATGT GCATAAAAAT TTAGTTGCAG 3120 

CTATTGGTGG CGTAgCAAGA AATATTGCTG 3180 

CTTTTTTATC TGCTAGTGGA CAAGATAGTG 324 0 

CATTTATGAA TTTGGATCAT GTTCAACAAT 33 00 

CTTTAATTAG TAAAGAAGGC GACATGACAT 336 0 

ACATTACGCC TGAATTTTTA ATTAAGCGTT 34 2 0 

TTGTAGATTT GAATTTAGGC AAAGAGGCAT 34 8 0 

ATCAAATCAA ATTAGTTATC ACCACGGTTT 354 0 

CATTACATGC TATTGATTGG ATTATCACGA 36 00 

TAAAAATAGA ATCTACTGAT GATTTAAAAA 3 66 0 

TTAAAAATGT TATTGTGACA AATGGCGTGA 3 72 0 

AAATCATTAA GTCAGTTATG CCATCAAATA 3 7 80 

CATTCTGTGC TGCAGTAGTG TATAGCTGGT 3 84 0 

TTGCTGGTAT GGTTAACGCA AAGAAAACGA 3 90 0 

TAGATCAACA GCAACTTTAT CACGATATGG 3 96 0 

TATATTGAGT ATTCTCGAGA AGTTCAGCAA 4 02 0 

TTAGAATCAA CAATTATTTC GCATGGTATG 4 0 80 
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GCCATTCCAG 


CAACCATAGC CATTATAGAT GGCAAAATTA AAATTGGTTT AGAAAGCGAA 4200

GATTTAGAAA TACTGGCAAC TAGTAAAGAC GTTGCTAAAG TATCTAGAAG GGATTTAGCA 4260

GAAGTTATTG CGATGAAGTG TGTTGGTGCT ACTACTGTAG CGACGACGAT GATATGTGCT 4320

GCAATGGCTG GTATTCAATT TTTTGTTACA GGAGGTATTG GGGGCGTCCA TAAAGGTGCA 4380

GAACATACGA TGGACATTTC AGCAGACTTA GAAGAACTGT CTAAAACAAA TGTCACTGTT 4440

ATCTGTGCAG GTGCCAAATC AATTTTAGAC TTACCTAAGA CGATGGAGTA TTTAGAAACA 4500

AAAGGCGTTC CAGTTATTGG ATATCAAACG AATGAATTGC CAGCATTCTT CACTCGCGAA 4560

AGCGGTGTTA AGTTAACAAG TTCGGTTGAA ACGCCAGAAC GACTTGCTGA CATTCATTTA 4620

ACAAAACAGC AGTTAAATCT TGAAGGTGGC ATTGTTGTTG CTAATCCAAT TCCATATGAG 4680

CATGCCTTAT CAAAAGCATA TATTGAGGCA ATCATAAATG AAGCTGTTGT TGAAGCGGAA 4740

AATCAAGGTA TTAAAGGTAA GGACGCCACA CCGTTCTTGT TAGGGAAAAT TGTAGAAAAA 4800

ACGAATGGTA AAAGTTTAGC AGCAAATATA AAACTTGTTG AAAACAATGC GGCGTTGGGT 4860

GCTAAAATTG CTGTCGCTGT TAATAAATTA TTGTAGGTGA TGATACATGA ATATTTTATT 4920

CGCTATCACA GGGATAGCAT TTGCACTATT TGTTGCGTTT TTATTCAGTT TTGATCGTAA 4980

AAAAATAGAC TTCAAAAAGA CGTTAATAAT GATATTTATT CAAGTGTTGA TCGTGTTATT 5040

TATGATGAAC ACAACGATTG GTTTGACAAT TTTAACTGCA CTAGGTTCAT ■rrriTGAAGG 5100

GCTAATAAAT ATTAGTAAAG CAGGCATAAA TTTTGTTTTT GGAGATATAC AAAATAAAAA 5160

TGGCTTTACG TTCTTTTTAA ACGTATTACT GCCATTAGTT TTTATTTCTG TATTAATAGG 5220

CATCTTTAAT TATATTAAGG TATTACCATT TATTATCAAA TATGTAGGTA TCGCTATTAA 5280

TAAAATAACT AGAATGGGGC GCTTAGAAAG TTATTTTGCT ATTTCAACAG CAATGTTTGG 5340

GCAACCAGAA GTATATTTAA CAATAAAAGA TATTATTCCA AGATTATCTA GAGCGAAATT 5400

ATATACAATT GCGACGTCTG GTATGAGTGC TGTTAGTATG GCAATGCTAG GTTCATATAT 5460

GCAGATGATT GAACCCAAGT TCGTAGTTAC AGCAGTAATG TTAAATATTT TTAGTGCGCT 5520

TATCATCGCC AGTGTAATCA ATCCCTATAA ATCTGATGAT ACTGATGTTG AAATTGATAA 5580

CTTAACGAAA TCCACAGAAA CTAAAACATT GAATGGAAAA ACAGGAAAAC CTAAGAAAGT 5640

TGCCTTTTTC CAAATGATTG GTGATAGTGC GATGGATGGG TTTAAAATCG CTGTTGTAGT 5700

AGCCGTAATG TTGTTAGCAT TTATTTCATT AATGGAAGCA ATTAATATCA TGTTTGGTAG 5760

TGTTGGTTTG AACTTTAAAC AGCTTATTGG CTATGTGTTT GCACCAATCG CATTCTTAAT 5820

GGGGATTCCA TGGAGCGAAC TGTTCCAGCT GGCTCTTTAA TGGCGACTAA ATTAATTACA 5880
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CAAGGTATCA 


TTTCAGTTTA CTTAGTAAGc TTCGCTAATT TTGGTACGGT TGGTATCATC 6000

GTAGGTTCAA TTAAAGGCAT TAGTGATAAA CAAGGAGAAA AAGTTGCATC CTTTGCAATG 6060

AGGTTGCTAC TTGGTTCAAC TCTAGCTTCA ATCATTTCAG GATCAATCAT TGGCTTAGTA 6120

TTGTAAATGA ATCGAAGTAC CTAAATTAAA TTCATGGCAA AGCTAAACCC CGTCACCAAG 6180

TTGGCGCAAC AGCGcATgcA TAACTTAGTG ACGGGGTTTT ATCATAACAA TCTACTTTTT 6240

CGTAGCCGTT TTTGAAATGT ATGTTGATGG TTTATCTTTT TCAAAAATTG TTAATCCCGT 6300

TATATCTTTT TTATGTTTTG AAGGGACAAT GAAGCTAAGT ATATAAGCAA AGACAAAAGC 6360

AACTGTAAAT GAAATGGTAG ATACATAGAA AGGTGAGTTA CCTTTGCCAA CACCATTATA 6420

GACATAAGCA AAGATGATAC CCAATATTAA TCCACAAATA ACACCGAATG TATTCGTACG 6480

TTTAGTGAAA ATACCAACTG CAAATACACC AGCCAATGGA ACGCCGAATA ATCCAGTCAC 6540

AAACAAGAAT AAATCCCATA AGTCATTTGA ATTAGAAGCA ATTAAGTATA GTGACATTCC 6600

AAAACCGAAA ATACCTGCAA TGATAATAAT GAAACGTGCA AAGTTAACTT CGTGTCGCTC 6660

GCTACCTTTT CCGAAGAAGC GTTGCTTAAT GTCGATTGAA ATACAAGCAG ATATAGAATT 6720

TAAACTAGAT GAAATGGTAG ACTGTGCAGC GGCGAAAATG GCTGCAATAA GTAATCCTGC 6780

TACAAATGGT GGCATCTCAG TCAAAATGAA ATATGGCACT ACAGATGATG TATTGAAGCC 6840

TTTTGGTAAA ACAGCTTCAT GTGTATAAAA TGAATACAGC ATTGTACCCA TACCATAAAA 6900

TAAGGGTGCT GAAATTAAAG CTAGGATACC ATTTGTCCAT AACGATTTAT TTGTTTCTTT 6960

TAAACTATCA GAAGCTTGAT AACGCTGCAC GACGTCTTGA CTCGCTGTGT ATTGATACAA 7020

GTTGTTGAAA ATATTTCCTA GGAAAATAAT TGGAATGGCA GCTGCCGCAG TATTTAGTTT 7080

CCAATTGTCT GCACTAATTA ATTTTTTGTG CTCAATCGCA TCTGCAAAGA CAGTGCCGAA 7140

ACCGtCTTTA ATGTTCACAA CACCTAGAAT AATAATAACT AAAGCGCCGC CTAATAAAAT 7200

GACGCCTTGA ATGAAATCAC TCCAAACCAC ACCTTCGAAA CCACCTAAAA ATGTATATAA 7260

AATACATAGT AAACCAACGA GTGATGCAAC GATATAAGGG TTCATGTCTG ATACAGATGT 7320

GATTGCTAAT GTTGGTAAGT AGATAACAAT TGCAACACGC CCTAAATGGT AAACGACAAA 7380

TAATAATGAG CCAATGACAC GTATGCTAGG GCCAAATCTA GCTTCTAAAT ATTCATATGC 7440

AGATGTTACC TTTAACTTTT TAAAGAAAGG GACATAGAAA TAAATAAGTA ATGGAATAAT 7500

TGCGACGATA GCAATGTTAC CAGCGATATA TGACCAATCT GTTAAAAATG CTTTCTCTGG 7560

TGTCGACATA AATGTAATCG CACTTAACGT AGTAGCATAA ATTGAAAAGC CAACTACCCA 7620

^ACrACTTC; rGGTAAAGAA ACTATTGGTA CTTTGGCTCG CGCGCTTGGT 7680
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TGTGCCAAAT CCAACTTCTT TCATGGGCAA CATCCCCTTT ACAATGTATT GATTCTTTGA 7 800 

TGTCTATAAA TCGTATTTTG CAATGAGTTG ATCTAATGTT TGTCGATGTG CTTCGTTAAA 7 860 

AGGTTTGAAA GGTCTTTTCG GTAATCCTGC ATCAATGCCA CGATGACGTA ATATTTCTTT 7920

CAATGTTGGA TAAATCCCCA TTGATAACAC TGTTTCGATA ATGTCGTTTG AATCATGTTG 7 98 0 

CAGTTGGTAA GCTTCTTGAA TTTGACCTTG TCGTGCTAAG TCGAAGATTT TTCTTGCACG 8 04 0 

GCGACCATTA ACGTTATATG TAGAACCAAT TGCACCATCT ACGCCAGAAA TCGTAGCTTG 8100 

AACTAACATT TCATCAAAGC CAGATAAGAT TAATTTGTCT GGGAATGCTT TTCTAATACG 816 0 

TTCGAGTAGG AAGAAGTTTG GCGCTGTATA TTTAACACCA ACAATTTTTT CATGATTAAA 8220 

TAGCTCGCTG AATTGTTCAA TAGAAATATT CACACCTGTT AAATCTGGTA TTGCATAAAT 82 8 0 

AATCATATTG TTCTGAGTTG CTTCGATAAT ATCGAAATAG TAATCTCTAA TTTCTTCAAA 8340 

AGTAAATGGA TAGTAGAATG GTGTTACGGC AGAAAGTGCA TCATAACCGA GTTCTGTGGC 84 00 

ATATTTTCCA AGTTCAATGG CTTCATTTAA ATCTAACGAA CCTACTTGAG CAATCAATTT 84 60 

CACTTTATCC CCAACTGCCT CTTTGGCAAC CTTGAAAACT TGCTTCTTCT GCTCTGTATT 8520 

TAATAAAAAG TTTTCGCCTG AGCTACCATT TACATAAAGA CCGTCTAATT CTTCAGTTTC 8580 

AATGGCATTT TGAGCAATTT GTTTAAGTCC TTGTTCATTT ACTTGACCAT TTTCATCAAA 8 64 0 

AGGAACGAGT AACGCTGCAT ATAAACCTTT TAAATCTTTG TTCATTATGA AGTCCCTCCA 8700 

AAAATCATTT GATAATATAG TTTACAGCTA TAATTGTAAA CGCTATCATA AAATGTAACA 876 0 

ATATCTTTTT GAAAATTGTA GTCATATTTA TGTATAATTA ATGAAAATGT TTTTCAAAAT 8820 

CAATAGAAAT GGAGTGAGTA AGGTGTATTA CATCGCAATC GATATTGGAG GCACTCAAAT 8880 

TAAATCGGCA GTTATTGATA AGCAATTGAA TATGTTTGAC TATCAACAAA TATCAACGCC 894 0 

GGACAACAAA AGTGAGCTTA TTACTGACAA AGTATATGAG ATTGTAACAG GATATATGAA 90 0 0 

GCAATATCAG TTGATCCAAC CTGTCATAGG TATTTCATCA GCAGGCGTTG TTGATGAACA 906 0 

AAAAGGCGAA ATTGTATACG CAGGGCCAAC CATTCCGAAT TATAAAGGTA CTAATTTTAA 912 0 

GCGATTATTA AAATCACTGT CTCCTTATGT CAAAGTAAAA AATGATGTAA ACGCTGCATT 918 0 

ACTAGGCGAA TTGAAATTAC ATCAATATCA AGCAGAACGG ATCTTTTGTA TGACGCTTGG 924 0 

TACAGGCATT GGGGGTGCGT ACAAGAATAA TCAAGGTCAT ATTGATAATG GTGAGCTTCA 93 0 0 

TAAGGCAAAT GAAGTTGGGT ATTTATTGTA TCGTCCAACT GAAAATACAA CGTTTGAGCA 93 6 0 

ACGTGCTGCA ACGAGTGCAT TGAAAAAGCG CATGATTGCC GGAGGATTTA CGAGAAGCAC 94 2 0 

ACATGTGCCA GTATTGTTTG AAGCAGCTGA AGAAGGTGAT GATATTGCAA AACAAATATT 94 8 0 
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AGGGCTTATA TTAATTGGGG GCGGTATATC TGAACAAGGA GATAATCTCA TTAAATATAT 96 00 

CGAGCCGAAA GTTGCACACT ATTTACCAAA AGACTATGTT TATGCACCAA TACAAACGAC 9660 

5 

TAAGAGTAAA AATGATGCAG CATTATATGG CTGTTTGCAA TGATAGTTGA AAGAAGGAGT 9720 

CATTCTAAAA TAGAATTTGA AACCGTTACG AGAGATGAGA GCTGTTGTTA GTTCCACACA 9780 

TCACACTCTA TCTAGGACCA ATCTAAACTA TATCAACCAA CAGTGTGCCA CGGGCAAATT 984 0 

w 

AAATTGAAGA AGCTGAGATA TTAAAATTTT AGAAAATGTA AAAAAATATT TGGTATTGAA 9900 

ATTAAAAAAG CACCTAGCAA CTCGTTGGGA CAATCACGAT GATTGTCTAC AGTTGCAGGT 9960 

GGATTTGAAT ATACTACTAG TTATTTGTTG TCTAGGATAA TAGATTTAGT ATGTTGATAA 10020 

GTTTGACTCA GATTCGTATT TTCTAATAAA TGATAACTCA CGATATCGAT TAAAAAGAGT 100 80 

GTCGCAATTT GTGTGTTGAT AAATTGATGG TCGGTATTAC GCGATTGATC CGTTGTTAAA 10140 

20 AGTACTAAAT CTGCACAATC TGTAAGTTTA CTACCTTCAA AATTTGTGAT GGCAACGACA 10200 

TATGCACCAT GAGATTTGGC GACTTCCGCT GCAGAAATTA ATTCCGAAGT ATTACCACTA 10260 

TTTGACATAG CAATAAACAT ATCCGAATGA GATAGTAGGG ATGCCGATAT TTTCATTAAA 10320 

25 TGTGAATCGG TAGTAACATT ACCTTTTAGC CCCATACGAA TCATACGATA ATAAAATTCA 10380 

GTCGCTGATA AACCAGAGCT ACCTAGTCCA GCAAAGAGTA TATGTCGACT TGATTGAAGT 10440 

TTGTCGATAA AGGTTTGGAT AATGTCGTTA TCAATAAATT CACCAGTTTG TTGAATGATT 10500 

30 TGTTGATGAT ATTTATGAAT TCTTTGAATA ATTGGGCTAT TTTCAATAAC TGTCTCTGTC 10560 

ATTTCTTGTT GAATATTAAA TTTTAAATCT TGGAAATTCT CATAATCCAG CTTATGACTA 10620 

AAGCGTGTCA TCGTTGCTGG TGATGTACCA ATCGCATGGG CTAAGGAGTT AATCGTTGAA 1068 0 

35 

AAGGCATCGC TATAACCATT TTGTCTTATA TAATTGACGA TGCGTTTATC AGTTTTTGTA 10740 

AATAAATGTT GATAACGTTG AACACGATTC TCAAATTTCA TTGTGTCACC CCTTCATCTT 108 00 

AATGATTACT ATTATATATG AAAAATATTT TCAAGATAGT AAAAAGCATT GATAAAAATT 10 86 0 

40 

ATCTTAATGA TATATTGTAA ATGACTTTAC GTGAAAAAAC GACTTATGGA GTGAGGAATA 10 920 

ATGTTACCAC ATGGATTAAT AGTATCTTGT CAGGCACTAC CAGATGAACC ATTGCATTCA 10 980 

4 5 TCTTTTATTA TGTCGAAAAT GGCATTAGCT GCGTATGAAG GTGGTGCTGT TGGTATTCGC 11040 

GCAAATACTA AGGAAGACAT TTTAGCAATT AAAGAAACGG TAGATTTACC AGTTATTGGC 1110 0 

ATTGTGAAAC GTGACTATGA TCACTCAGAT GTTTTCATTA CTGCAACGTC AAAAGAAGTT 1116 0 

50 GATGAACTGA TAGAAAGCCA ATGTGAAGTC ATTGCATTGG ATGCAACGTT ACAGCAACGT 112 2 0 

— AAA^AAA rCTTAGACGA ATTAGTATCA TATATTAGAA CACATGCACC GAACGTTGAA 112 30 
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TATATTGGCA 


CGACGTTACA TGGCTATACT AGTTATACGC AAGGACAATT ACTTTATCAA 11400

AATGACTTCC AATTTTTAAA AGATGTACTA CAAAGTGTTG ATGCAAAAGT TATTGCGGAA 11460

GGTAATGTCA TTACACCGGA TATGTATAAA CGTGTGATGG ACTTAGGCGT TCATTGTTCA 11520

GTCGTTGGTG GTGCGATAAC ACGACCAAAA GAAATTACGA AACGTTTTGT TCAAATTATG 11580

GAAGATTAAA TGATAACGAT AAAAAAACGA GATGACCATC ATTAATTAAA GGCACCTAAT 11640

TATCTTAGGT GGCTGAATGA ATGTAATGGG TTCATCTCGT TTTGTTTGTT TATGATAGTG 11700

ATTTTATTTT CAACTTTATC CAAAAATAAG TAAAGCGACG GGGATGGTGA TTAATAGCGA 11760

CAACGCCACG CGTAAAAACC AAATGATGAT GAGTTTCCAG ACAGGTATTT TAATTTCAGT 11820

TGCTAGTATA CATGGCACTA ATGCTGAGAA AAAGATAATG GCTGATACGC TTACTACACC 11880

GACGACAAAT TTAGTACTCA TTGCAGCTTT AGTTACTAAC AAAGATGGTA GAAACATCTC 11940

TACAATAGAA AckCTGACGC TTTTGCTAGT AAAGCCTGAT CAGCAATTGG GAAAATATAA 12000

ATAAATGGAT AGAAGATATA GCCAAGCCAA TCAATGAATG GTGTATAGTT CGCTACAATC 12060

AGTCCTAAAA AACCAATCGA TAATATAGAA GGTAAAATAC CAACAGTCAT TTCTAAACCG 12120

TCTTTCAAAT TGTCCCAAAC GTTCTTCACG AGAGATGGTG TTAATGCATT TTGTTTCATC 12180

GCCTCTGCAT ATGCAGTTTT CAGTCTGCTT CCTTCAATAG CAACTTCTTG TTCTCCTTCT 12240

TGTCCGTTAT AATATTCTGT TGATTCATTG CTGATTGGCG GTAGCCATGC AGTAATTGCA 12300

GTCACGACAA ATGTGATGAC TAAAGTTATC CAAAAGTATA AATTCCAATG CGGCATTAAT 12360

CCTAAAGTTT TAGCAACGAT AATCATAAAA GTTGCTGAAA CTGTTGAAAA GCCAGTCGCA 12420

ATAATCGTGG CTTCTCGTTT GTTGTACATC CCTTGCTTAT AGACACGATT AGTAATCAAT 12480

AATCCTAAGG AATAACTGCC GACAAACGAA GCCACTGCAT CGACAGCGGA TTTTCCTGGT 12540

GTTTTAAAAA TAGGTCTCAT AATAGGCTCC ATATAAACAC CGACAAATTC TAATAAGCCA 12600

TAGCCCACTA ATAAAGAAAG CGcAATTGCA CCTACTGGAA TTAAGATACT TAATGGCATC 12660

ATTAATTTTT CAAACAAAAA CGGACCATAG TTAGCTTTAA ATAGTATTGA TGGACCGATT 12720

TTAAATACAT ACATTATACC GATCATTGCA CCTGCAACTT TAAATAATGT AATGACCAAG 12780

TTTGTGATTG AAGTCATAAA AGTACGTCTC ACTATTGGTA ACGCTGTACC AATTAAAATC 12840

ATAATCAGTG CAACATAGGG CATAAGTGGA CCTATGATTG AGCGAATGGC TAGATGAACA 12900

TGATCGACGA AAATAGTGTT GTTACCATTA ATCGTAAAAG GAATAAAGAA ACATAGTATG 12960

CCCACTAAAC TATAGACAAA AAAACGCCAT GCACTTGGTT GTTGTGCATT AGAATGATAT 13020

TGATTCATTA AAGCAACCCC TTTGTTTAAA TGAATACACA AAACTGTATG ATGCATCTTC 13080
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15 



20 



ATAGTTTGAA TTATTTTCAT ACCAATACAA ATTAACTAAT TATATATAGA TTGAAACTAT 13200

ATTACTTAAT AAAATATTTA TCTTAAATGT TGTTGTGTTG ATTCAACACC ACAACTAAAA 13260 

GTGTTTATAA ATTATTTGGA AATACACATA TTTGTAAATG ATTAGTATCG ATTTAATATC 13320 

GTATTATTAA ATTTTTATTA ATTTTGTAGT CTTAATCmAA AAATAATATA TGTCATGTTA 13 3 80 

TATTGAAGGT GCAGTTGTTT TTCATTCTCA AGAGGGGGTC AAAAAAATAC TTTTGAGGTG 13440 

ATTATATGTT AAGAGGACAA GAAGAAAGAA AGTATAGTAT TAGAAAGTAT TCAATAGGCG 13 500 

TGGTGTCAGT GTTAGCGGCT ACAATGTTTG TTGTGTCATC ACATGAAGCA CAAGCCTCGG 13560 

AAAAAACATC AACTAATGCA GCGGCACAAA AAGAAACACT AAATCAACCG GGAGAACAAG 13620 

GGAATGCGAT AACGTCACAT CAAATGCAGT CAGGAAAGCA ATTAGACGAT ATGCATAAAG 13680

AGAATGGTAA AAGTGGAACA GTGACAGAAG GTAAAGATAC GCTTCAATCA TCGAAGCATC 13 740 

AATCAACACA AAATAGTAAA ACAATCAGAA CGCAAAATGA TAATCAAGTA AAGCAAGATT 13 800 

CTGAACGACA AGGTTCTAAA CAGTCACACC AAAATAATGC GACTAATAAT ACTGAACGTC 13860 

AAAATGATCA GGTTCAAAAT ACCCATCATG CTGAACGTAA TGGATCACAA TCGACAACGT 13920 

CACAATCGAA TGATGTTGAT AAATCACAAC CATCCATTCC GGCACAAAAG GTAATACCCA 13980

ATCATGATAA AGCAGCACCA ACTTCAACTA CACCCCCGTC TAATGATAAA ACTGCACCTA 14 040 

AATCAACAAA AGCACAAGAT GCAACCACGG ACAAACATCC AAATCAACAA GATACACATC 14100 

AACCTGCGCA TCAAATCATA GATGCAAAGC AAGATGATAC TGTTCGCCAA AGTGAACAGA 14160

AACCACAAGT TGGCGATTTA AGTAAACATA TCGATGGTCA AAATTCCCCA GAGAAACCGA 14 220 

CAGATAAAAA TACTGATaAT AAACAACTAA TCAAAGATGC GCTTCAAGCG CCTAAAACAC 14 280 

GTTCGACTAC AAATGCAGCA GCAGATGCTA AAAAGGTTCG ACCACTTAAA GCGAATCAAG 14340

TACAACCACT TAACAAATAT CCAGTTGTTT TTGTACATGG ATTTTTAGGA TTAGTAGGCG 144 00 

ATAATGCACC TGCTTTATAT CCAAATTATT GGGGTGGAAA TAAATTTAAA GTTATCGAAG 144 60 

AATTGAGAAA GCAAGGCTAT AATGTACATC AAGCAAGTGT AAGTGCATTT GGTAGTAACT 14520

ATGATCGCGC TGTAGAACTT TATTATTACA TTAAAGGTGG TCGCGTAGAT TATGGCGCAG 14580

CACATGCAGC TAAATACGGA CATGAGCGCT ATGGTAAGAC TTATAAAGGA ATCATGCCTA 14 64 0 

ATTGGGAACC TGGTAAAAAG GTACATCTTG TAGGGCATAG TATGGGTGGT CAAACAATTC 14 7 00 

GTTTAATGGA AGAGTTTTTA AGAAATGGTA ACAAAGAAGA AATTGCCTAT CATAAAGCGC 14 76 0 

ATGGTGGAGA AATATCACCA TTATTCACTG GTGGTCATAA CAATATGGTT GCATCAATCA 14 8 20 

CAACATTAGC AACACCACAT AATGGTTCAC AAGCAGCTGA TAAGTTTGGA AATACAGAAG 14880
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ATTTAGGATT AACGCAATGG GGCTTTAAAC AATTACCAAA TGAGAGTTAC ATTGACTATA 15 00 0 

TAAAACGCGT TAGTAAAAGC AAAATTTGGA CATCAGACGA CAATGCTGCC TATGATTTAA 15 060 

5 

CGTTAGATGG CTCTGCAAAA TTGAACAACA TGACAAGTAT GAATCCTAAT ATTACGTATA 1512 0 

CGACTTATAC AGGTGTATCA TCTCATACTG GTCCATTAGG TTATGAAAAT CCTGATTTAG 15180 

GTACATTTTT CTTAATGGCT ACAACGAGTA GAATTATTGG TCATGATGCA AGAGAAGAAT 15240 

10 

GGCGTAAAAA TGATGGTGTC GTACCAGTGA TTTCGTCATT ACATCCGTCC AATCAACCAT 15300 

TTGTTAATGT TACGAATGAT GAACCTGCCA CACGCAGAGG TATCTGGCAA GTTAAACCAA 153 60 

15 TCATACAAGG ATGGGATCAT GTCGATTTTA TCGGTGTGGA CTTCCTGGAT TTCAAACGTA 15420 

AAGGTGCAGA ACTTGCCAAC TTCTATACAG GTATTATAAA TGACTTGTTG CGTGTTGAAG 15480 

CGACTGAAAG TAAAGGAACA CAATTGAAAG CAAGTTAAAT TCATCTTCTG AATTTAATAT 15540 

20 GCTATGTAAA TCGTGCTGTT ATCATGGCAC ATCAGATATA AGTAGCATCA CAGTGTTGAA 156 00 

TTTAAAAATA GTAAAGTGAA ATAAAGCGCC TGTCTCATTA GCGAAAACTA AAGGGACAGG 15660 

CGTATCTGTT TATGAGCTTA ATAAATTGTA TGAATAATAT GGTTGATCGA ATAACTGTTT 15720 

25 ATCATGATGA TAAATTGAGT TTTTTAAAAT AATGATATAT TACATCATTG TTATAGCGTT 15780 

TAAGAAATCA ACAACTTTAC GATAAATAGT GATTGCTTCG TCATTAGGTC TACGATCAAA 15840 

ATCATGCTCG TTTTTATTCA CGCGTTCAAA TGTTGAATGT GGAACATGAT TCATGATATG 15900 

30 

TTCGCTTTCC TCAACGGGAA CATCATAATC GCCATTACAA TGCGCAATGA AAACAGGTGG 1596 0 

AAGTGTTTTA AGTTCATCTG GTGCAATATT ATATTTTGAA TTAGTATAAT CAGCAATGTT 16 02 0 

AATCATATTT ATCCATTTAC CTGTGCCACG TGCATAAACG TAGATTAAAA AACGTTGTGC 16 08 0 

35 

GATTTGATCT TGAACAACCG GTGTTGGTGA AGTGAGTTGT GCAATCATTG TTTCGTTTAC 1614 0 

GCTTTGAGCT ATTTTTGCGT AATAACTATT AGTTGTTTTA AAAGGTTCAG TGTTGATGCG 16200 

ACTATAACCA TAAAAATCAA TAACACCATC AATATCTCTG TCTCGTGCAA TTAATAGACT 16260 

40 

TAAATATGCA CCTGATGATC TGCCAAAGGT AAAAATAGGG CAATTAGAAT ATTGTGATTG 16 320 

AATCGCATCG AATGAtGCgn AGnACATCCT CAATAATGCA ATCGAGACTT ACTTCTGGTA 16 38 0 

45 ATAAACGATA ACTTAGTTGA ATTAAATCGT AATGTTCCGT AAgATATCGA TATACTGTGG 16440 

GGATAAATCG TTAGCTTTAC CGAACATTAA TCCACCACCG TGGATGTAGA CAATAGCGCC 16 50 0 

TTTTGTTGGT TGATTTTTTG CTTTAATAAT TGTGTAAGGT AATGCAAATG CATCTTTAGT 16 560 

50 AATTACTTTA TCTTTAATTT CAGTCACGAT TTAATAGGCT CCTTATTTTT GATATTGATG 16 620 

TCATTATAAC ACTGTCTTAA ATTTCCATGA AAAATAGTCT TAAGACGATG AGTCATGATA 1668 0 



EP 0 786 519 A2



CATCATTTTA ACAATATCTT TAAAAGCAGC ATGTGGAATG GCTAAATCTT CTAAATCTGC 16800 

CATAGAAAAT TCAAGATTGA TATCATGTGG TCGCTGTTCA GCAAGTTTAT GCACAAAGTC 16 860 

5 

AGGTTCTGTG ACAAAAGGCG AAGACATGCC GACCATATCT GCATGTTGTA AAGCATCTAA 16 920 

AGCAGACTCT GGAGAATTAA TCCCGCCACT TGCAATTAAA GGGATACGAC CTGCTAAATG 16 980 

TTCATAGACA ATTTGGTTAA CTGGTCGACC GAAATGATCA CCTGGTGTAC GAGACGTATT 17040 

10 

TTGATAAATA TGTCGACCCC AGCTAGCGAT TGCTAAGTAT TGGATGTTTG AAACGTCCAT 17100 

GACCCAATTG ATTAATTGGT TGAACTCGTC AATGGTATAT CCTAAATCAC TGCCTCTGGT 1716 0 

TTCTTCTGGC GTTGCTCGAA ATCCTAAAAT AAAATTGTCA GGTGCTTCTT TATCAATCAC 17220 

TTCTTGTACC GCACGCATAA CTTCTAAACA TAATCTTGCA CGATTTTTTA ATGAGTCGGC 17280 

ACCGTAATGG TCTGTACGTT TATTCGAAAA AGTTGAGAAA AATGTTTGAA TCAGCAAACG 1734 0 

20 TTGTGCAATC GAAATTTCCA CACCATCAAA ACCTGCTTTA ATCGCGCGTA ATGTAGCATC 17400 

GCGATACTGC TGAATGATGC TATTGATTTT CTCATGAGAC ATGGCGATAA CATCGTGTTC 17460 

AATCGGTGAA TGCAATGTCA TAGGGCTTGG TCCATACACC TTTCCAAAAT TTAAAATGGC 17520 

25 TTGATTTGAA AAACGACCAG CATGCGCTAg CTGGATAATA GCGAGGCTAC CATGTTGTTT 17580 

CATCGTAGAT GCCATGTTAG TTAATCCAGG GATACAAGCA TCATGATCAA TATTAAAGCC 17640

ATATTCAAAC AATTGACCAT AAGGTTCAAT GTAAGCAGCG CCGGTGACTT GCATTCCAGC 17700 

30 TGAATTAGAG CGACGTGCAG CATAAGCCAA GTCTTCTTTT GTAATATAGC CTTCTTTTGT 17760 

TGATGTGTTT ACGGTCATTG GTGATAATAC AAAGCGATTC GAAATTTTGA TGCCATTAGG 17820 

TAAGTGGATT GATTGTAAAA GTGGTTTGTA TCGGTACATA CTATGATTCC TTTTCTATTC 17880 

35 

AATATTGTTT TCAAAGTACC ATGGAAAGAA TGAATAATCA ATGATGAACA GTCTTGATAG 17 940 

AATAGAATTG GTACATGGAA AGTATTTTTA AAATTAAACT AATGAATGGC ATTTGTAGGT 18000 

CTGAAAATAT GAATATGAAA AAGAAAAATA AAGGCGAAAA GATATAAAAG TTAATTGAAA 18 060 

40 

AACGTTATCA TATACGTGGG TATATGAAGA GGGAATGGTA TTAAGAACGC TAAAATGTTA 18120 

TGTCGGTTTG ACATGACAGG ATAAGTTTGG AGATGACGGA TTGGTTAAAT TAAGCGTATT 18180 

45 AGACTATGCC TTAATAGATG AAGGTAAGGA TGCACAAAAG GCATTGCAAG ATTCAGTGAC 18240 

ACTTGCAAAA TTAGCAGATC GACTTGGCTT TAAGCGAATT TGGTTTACGG AACATCATAA 18300 

TGTACCAGCG TTTGCGTGTA GTAGTCCAGA ACTTTTGATG ATGCATACAT TGGCGCAGAC 18360 

50 AAATCACATA CGAGTTGGCT CTGGTGGTGT GATGCTGCCG CACTATCGAC CTTATAAAAT 18420 

TGCTGAGCAT TTTAGAATGA TGGCAGCGTT ATATCCAAAT CGTATTGATT TAGGTATTGG 18480
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TAGTTACGAT GAATCGATTT CGTTATTACG TGATTATCTT ACAATAAAGG ATAAACCAAG 18600 

TGCGCATACG TTAGGTGTCC AACCACACAT TGATCATTTT CCAGAAATGT GGTTATTAAG 18660 

TAGTAGCGCA ACATCTGCCA AAATAGCTGC CGAACTAGGT ATAGGGCTTT CTGTTGGAAC 18720 

ATTTTTGCTA CCAGATATAA ATGCGATACA TACAGCGAAG GATAACATTG ATATTTACAA 1878 0 

AAAACATTTC CAAGCATCAA CGATTAAAAT GGACGCAAAG GTGATGGCAT CTGTATTTGT 18840 

CATTGTAGCT GATAACGAAG CGGAAGTAGC AGCATTACAA CATGCCTTAG ATGTTTGGTT 1890 0 

ATTAGGTAAA TTACAATTTG CAGAATTTGA AGATTTTCCT TCAGTAGACA CAGCACAAAA 18960 

GTATAAGCTT AATGATCGAG ACAAAGAGAT GATTCAAGCA CATCAAGCAC GCATCATTGC 19020 

AGGTACACAA GAAAAGGTTA AAGCACAATT AGATGATTTC ATTGCTACGT TTGAAGTTGA 19080 

TGAGGTGTTA GTAGCACCGC TTATTCCAGG TATTGAACAG CGTTGTAAAA CATTAAAATT 19140 

ACTCGCGGAA ATTTATTTGT AGCATTTTAA ATAGAAGAGA AAGGATGAAG ATAAGATGAA 19200 

AAAGTTAGCC AATTATTTAT GGGTAGAAAA AGTAGGAGAT TTGTATGTGT TTAGTATGAC 19260 

ACCTGAATTG CAAGATGATA TTGGGACAGT AGGTTATGTT GAATTCGTAA GTCCAGATGA 19320 

25 AGTTAAAGTG GATGATGAAA TTGTGAGTAT CGAAGCATCG AAAACGGTCA TTGATGTGCA 19380 

AACGCCATTG TCAGGAACGA TTATTGAGCG AAATACAAAA GCGGAAGAAG AACCGACAAT 19440 

TTTAAACTCT GAAAAACCAG AAGAAAATTG GTTGTTCAAA TTGGATGATG TCGATAAAGA 19500 

AGCATTCCTA GCATTACCGG AGGCTTAAAT GGAAACGTTA AAATCAAATA AAGCGAGACT 19560 

TGAATATTTA ATCAATGATA TGCATCGAGA GAGAAATGAC AATGACGTAT TGGTAATGCC 19620

ATCTTCATTT GAAGATTTGT GGGAATTATA TCGAGGCTTA GCAAATGTCA GACCGGCATT 19680 

ACCTGTAAGT GATGAATATT TAGCTGTACA AGATGCTATG TTAAGTGATT TGAATCGTCA 19740 

ACATDTTACG GATTTGAAGG ATTTGAAGCC GATAAAAGGT GACAATATCT TTGTTTGGCA 19800 

AGGTGATATC ACGACGTTAA AAATCGATGC TATTGTTAAT GCTGCAAATA GTCGTTTTCT 19 860 

AGGATGTATG CAAGCTAATC ATGACTGCAT TGATAATATT ATTCATACAA AAGCGGGTGT 19 920 

TCAAGTTCGA CTTGATTGTG CAGAGATCAT TCGACAACAA GGGCGCAATG AAGGTGTAGG 19980 

TAAAGCCAAA ATAACACGTG GATATAATTT GCCAGCAAAG TATATAATTC ATACGGTTGG 2 0040 

TCCGCAAATA CGTCGATTGC CTGTTTCAAA GATGAATCAG GACTTGTTAG CTAAATGTTA 2 0100 

TCTTAGCTGT CTTAAATTGG CTGATCAACA TAGTTTAAAT CATGTCGCTT TTTGCTGTAT 2 016 0 

50 ATCTACAGGT GTATTTGCTT TTCCTCAAGA TGAAGCAGCA GAAATTGCTG TTCGAACAGT 2022 0 

AGAAAGCTAT CTCAAAGAAA CAAATTCAAC ATTGAAAGTC GTGTTCAATG TATTTACAGA 20280
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CAATGTCTCT GTTAATGGAT GACAAGACAA AGCAGGCTGA AGTATTGCGT ACTGCGATTG 20400 

ATGAAGCAGA TGCGATAGTG ATTGGAATTG GTGCAGGCAT GTCTGCATCT GACGGATTTA 2 0460 

CATATGTAGG AGAGCGTTTT ACGGAAAATT TCCCAGATTT TATTGAAAAA TATCGCTTCT 2 0 520 

TTGATATGTT GCAAGCGAGT TTACATCCTT ATGGCAGTTG GCAAGAGTAT TGGGCATTTG 2 0580 

AGAGTCGTTT TATTACATTA AACTATTTAG ATCAACCTGT AGGTCAGTCT TACCTCGCTT 20640 

TAAAATCCTT GGTGGAAGGT AAACAGTACC ACATTATAAC TACGAATGCA GATAATGCTT 20700 

TCGATGTAGC TGATTATGAT ATGACTCATG TATTTCATAT ACAAGGGGAG TATATACTGC 20760 

AACAGTGTAG cTCAGCATTG TCATGCTCAA ACGTATCGCA ATGATGATTT AATTCGTAAA 2 0820 

ATGGTTGTTG CGCAACAAGA TATGCTTATA CCTTGGGAGA TGATTCCAAG ATGTCCAAAA 20880 

TGTGATGCCC CAATGGAAGT GAATAAACGT AAAGCGGAAG TTGGGATGGT TGAAGATGCT 2 0940 

20 GAATTTCATG CGCAACTACA TCGTTATAAT GCTTTTCTAG AGCAACATCA AGATGATAAA 21000 

GTGTTGTATT TGGAAATTGG AATTGGTTAT ACTACACCAC AATTTGTGAA GCATCCTTTT 21060 

CAGCGTATGA CACGTAAAAA TGAAAATGCC CTTTATATGA CGATGAATAA AAAGGCATAT 21120 

25 CGCATTCCGA ATTCAATTCA AGAACGTACC ATACATTTAA CTGAGGATAT CTCAACATTG 21180 

ATTACAGCAG CACTCCGGAA CGACAGCACA ACGAAAAATA ACAACATTGG AGAGACAGAA 21240

GATGTACTTA ATAGAACCGA TTAGAAATGG AGAATATATT ACTGATGGTG CGATTGCACT 213 00 

30 CGCTATGCAA GTTTATGTTA ACCAGCATAT CTTTTTAGAT GAAGATATTT TATTCCCTTA 21360 

TTATTGTGAT CCAAAAGTGG AAATTGGACG TTTTCAAAAT ACTGCTATAG AAGTGAATCA 21420 

AGATTATATA GATAAACACA GTATTCAAGT AGTTCGCCGA GATACTGGTG GTGGCGCTGT 214 80 

GTATGTTGAT AAAGGTGCCG TTAATATGTG TTGTATTTTA GAACAAGACA CTTCAATTTA 21540 

TGGT5ATTTT CAACGATTTT ATCAACCAGC TATAAAGGCG TTGCATACAT TAGGTGCAAC 21600 

AGATGTGGTA CAAAGCGGTA GAAATGATTT AACATTGAAT GGTAAAAAAG TGTCAGGCGC 21660 

CGCAATGACA TTAATGAATA ATCGTATTTA TGGCGGTTAT TCGCTATTAC TTGATGTTAA 21720

TTATGAAGCA ATGGATAAAG TGTTAAAGCC TAATCGCAAA AAGATTGCAT CGAAAGGGAT 21780

TAAATCTGTG CGCGCACGTG TTGGTCATCT TAGAGAAGCA CTGGATGAAA AGTATCGTGA 21840 

TATAACCATT GAAGAATTTA AAAATTTAAT GGTGACGCAG ATTTTGGGAA TCGATGACAT 21900 

TAAAGAGGCG AAACGATATG AATTAACGGA TGCAGATTGG GAAGCGATTG ATGAATTAGC 21960 

TGATAAAAAG TATAAAAATT GGGATTGGAA TTATGGCAAG TCACCCAAAT ATGAATACAA 22020

TCGAAGTGAA AGATTATCTT CAGGTACGGT AGACATAACA ATTTCTGTTG AACAAAATCG 22080
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AGAAGCATTA CAAGGAACAA AAATGACAAG AGAAGATTTA ACGCATCAGT TAAAGCAATT 222 00 

AGACATCGTT TATTATTTTG GCAATGTTAC GGTAGAAGCA TTAGTGGATA TGATTTTAAG 22260 

TTAATATTGT TATTTTATGT ATGCTGAATC ATTGGAAGTG TTTGCTTGCT CTTGAAAAGG 22 320 

TGACAATAGT GTTTGGTGAA GGTTGAACAT ATGAGTGGAA ATTATTGCCT TTAACTATTC 223 80 

AAAGTATGAT ATATATATGG TTTTTGTTTC TAAATGATTG GGTATTTGAA AATAGATGAG 22440 

TTTAATATTT TAAGGAATAT AATGATGTTT ACTTTTATAA TTCATATAGA ATATTAAGCA 22500 

ATATAAGTCT GTTGATATAT ACAAAATATA ATGACTGCTA TAATGAGTAA TCAATAGACA 2256 0 

CAAAGAGGAG ATTATGTGAT GAATAATAAA GTATTAGTAA CCGGTGGTAC AGGGTTTGTT 22620

GGCATGCGAA TTATTTCACG ATTATTAGAA CAAGGTTATG ACGTACAAAC GACGATACGT 226 80 

GATTTAAGTA AAGCTGATAA AGTAATTAAA ACAATGCAAG ACAATGGCAT TTCCACAGAG 2274 0 

CGATTAATGT TTGTCGAAGC GGATTTATCA CAAGATGAAC ATTGGGATGA AGCAATGAAA 22800

GATTGCAAGT ATGTCTTGAG TGTAGCATCT CCGGTGTTTT TCGGTAAAAC AGACGATGCA 22 8 60 

GAAGTGATGG CGAaCTGcAA TTGAAGGTAT ACAACGTATT TTAAGAGCTG CAGAACATGC 22920 

GGGTGTTAAA CGTGTGGTAA TGACTGCAAA CTTTGGTGCA GTTGGTTTTA GTAATAAAGA 22980

TAAAAATTCA ATCACAAATG AAAGTCATTG GACAAATGAA GATGAACCAG GCTTATCAGT 23 040 

ATATGAAAAA TCAAAATTGT TAGCTGAAAA GGCAGCGTGG GATTTTGTTG AGAATGAAAA 2310 0 

TACAACAGTA GAATTTGCCA CAATCAATCC AGTTGCAATT TTTGGGCCAT CATTAGATGC 2 3160 

ACACGTTTCA GGAAGCTTTC ATTTATTAGA AAATTTATTG AATGGTTCAA TGAAACGTGT 2 3 220 

ACCGCAAATT CCGTTAAATG TTGTTGATGT GAGAGACGTA GCTGAACTGC ACATTTTGGC 23280

AATGACAAAT GAACAAGCTA ATGGCAAGCG ATTTATTGCG ACGGCTGATG GACmAATTwA 23340 

tTTGTTGGGA ATTGcCAAAt TAATTAAAGA AAAGGGCCTG GAAATAGCTC CAAAAGTTCC 234 00 

TACTAAAAAA TTACCCAGCT TTATTTTGAG CnAnGnGCC 23439

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4522 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

CCCTTTGAGA GTATATCATC TAGTCAAATT ATGCCTGTCA TTAGAGCGAC TAGCTTTGAT 6 0 
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TATTATGCAG TCGATTTAGG GAAATCATAT CGTCTAATTG ACGAAAGCAT GTTAGAGGAT 180 

TTGAAGTTAA CTGAACAACA AATAAGAGAA ATGTCTCTGT TTAATGTTAG AAAATTGTCA 24 0 

AATTCATATA CGACTGATGA AGTAAAAGGT AATATTTTTT ATTTTATTAA CTCAAATGAC 3 00 

GGGTATGATG CAAGTAGGAT ACTAAATACT GCATTTTTAA ATGAAATTGA GGCACAATGT 3 60 

CAAGGCGAAA TGCTCGTAGC AGTGCCACAC CAAGATGTGT TAATTATTGC AGATATACGC 42 0 

AATAAAACAG GATATGATGT GATGGCACAT TTAACAATGG AATTTTTCAC TAAAGGTCTA 4 80 

GTTCCAATTA CATCATTATC CTTTGGATAT AAACAGGGTC ATCTTGAACC GATATTTATT 54 0 

TTAGGTAAAA ATAATAAACA AAAAAGAGAT CCAAACGTGA TTCAGCGTTT AGAAGCAAAT 6 00 

CGTCGTAAAT TTAATAAAGA TAAATAGAAA TAATTGGATA AGGAGTTTTG TCATAATGAA 660 

TTTATTTTAC AATCCTAAAT ATGTAGGAGA TGTCGCATTT TTACAAATTG AACCAGTTGA 720 

AGGTGAATTA AACTACAATA AAAAAGGTAA TGTTGTTGAA ATTACtAATG AAGGTAATGT 780 

TGTAGGTTAT AATATTTTTG AAATTTCAAA AGATATAACA ATTGAAGAAA AAGGTCATAT 840

TAAATTAACT GATGAACTTG TAAATGTATT CCAAAAGCGT ATTTCAGAAG CTGGTTTTGA 900 

TTATAAATTA AATGCTGATC TATCACCGAA ATTTGTAGTT GGCTACGTTG AAACTAAAGA 96 0 

CAAACATCCT GATGCAGATA AATTAAGTGT ACTAAATGTA AACGTTGGAA ATGACACATT 102 0 

ACAAATTGTA TGTGGCGCGC CTAACGTTGA AGCTGGACAG AAAGTTGTTG TTGCTAAAGT 108 0 

AGGTGCAGTG ATGCCTAGCG GTATGGTAAT TAAAGATGCT GAATTACGTG GTGTTGCCTC 1140

AAGCGGTATG ATTTGTTCAA TGAAAGAATT GAATTTACCT AATGCACCTG AAGAAAAAGG 12 00 

TATTATGGTA TTAAATGACA GCTATGAAAT TGGACAAGCA TTtTTTGAAT AATTAAGGAA 126 0 

GGTAGTGAAA ATATGAGCTG GTTTGATAAA TTATTCGGCG AAGATAATGA TTCAAATGAT 132 0 

GACTTGATTC ATAGAAAGAA AAAAAGACGT CAAGAATCAC AAAATATAGA TrACGATCAT 13 80 

GACTCATTAC TGCCTCAAAA TAATGATATT TATAGTCGTC CGAGGGGAAA ATTCCGTTTT 1440

CCTATGAGCG TAGCTTATGA AAATGAAAAT GTTGAACAAT CTGCAGATAC TATTTCAGAT 1500 

GAAAAAGAAC AATACCATCG AGACTATCGC AAACAAAGCC ACGATTCTCG TTCACAAAAA 156 0 

CGACATCGCC GTAGAAGAAA TCAAACAACT GAAGAACAAA ATTATAGTGA ACAACGTGGG 1620 

AATTCTAAAA TATCACAGCA AAGTATAAAA TATAAAGATC ATTCACATTA CCATACGAAT 16 80 

AAGCCAGGTA CATATGTTTC TGCAATTAAT GGTATTGAGA AGGAAACGCA CAAGCCAAAA 174 0 

ACACATAATA TGTATTCTAA TAATACAAAT CATCGTGCTA AAGATTCAAC TCCAGATTAT 180 0 

CACAAAGAAA GTTTCAAGAC TTCAGAGGTA CCGTCAGCTA TTTTTGGCAC AATGAAACCT 1860
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AAACAAAAAT 


ATGATAAATA TGTAGCTAAG ACGCAAACGT CTCAAAATAA ACAATTAGAA 1980

CAAGAAAAAC AAAATGATAG TGTTGTCAAA CAAGGAACTG CATCTAAATC ATCTGATGAA 2040

AATGTATCAT CAACAACAAA ATCAATGCCT AATTATTCAA AAGTTGATAA TACTATCAAA 2100

ATTGAAAATA TTTATGCTTC ACAAATTGTT GAAGAAATTA GACGTGAACG AGAACGTAAA 2160

GTGCTTCAAA AGCGTCGATT TAAAAAAGCG TTGCAACAAA AGCGTGAAGA ACATAAAAAC 2220

GAAGAGCAAG ATGCAATACA ACGTGCAATT GATGAAATGT ATGCTAAACA AGcGGAACgC 2280

TATGTTGGTG ATAGTTCATT AAATGATGAT AGTGACTTAA CAGATAATAG TACAGATGCT 2340

AGTCAGCTTC ATACAAATGG CATAGAGAAT GAAACTGTAT CAAATGATGA AAATAAACAA 2400

GCGTCAATAC AAAATGAAGA CACTAATGAC ACTCATGTAG ATGAAAGTCC ATACAATTAT 2460

GAGGAAGTTA GTTTGAaTCA AGTATCGACA ACAAAACAAT TGTCAGATGA TGAAGTTACG 2520

GTTTCGAATG TAACGTCTCA ACATCAATCA GCACTACAAC ATAACGTTGA AGTAAATGAT 2580

AAAGATGAAC TAAAAAATCA ATCCAGATTA ATTGCTGATT CAGAAGAAGA TGGAGCAACG 2640

aATAAAGAAG AATATTCAGk AAGTCAAATC GATGATGCAG AATTTTATGA ATTAAATGAT 2700

ACAGAAGTAG ATGAGGATAC TACTTCAAAT ATCGAAGATA ATACCAATAG AAACGCGTCT 2760

GAAATGCATG TAGACGCTCC TAAAACGCAA GAGTACGCAG TAACTGAATC TCAAGTAAAT 2820

AATATCGATA AAACGGTTGA TAATGAAATT GAATTAGCAC CGCGTCATAA AAAAGATGAC 2880

CAAACAAACT TAAGTGTCAA CTCATTGAAA ACGAATGATG TGAATGATAA TCATGTTGTG 2940

GAAGATTCAA GCATGAATGA AATAGAAAAG AATAACGCAG AAATTACAGA AAATGTGCAA 3000

AACGAAGCAG CTGAAAGTGA ACAAAATGTC GAAGAGAAAA CTATTGAAAA CGTAAATCCA 3060

AAGAAACAGA CTGAAAAGGT TTCAACTTTA AGTAAAAGAC CATTTAATGT TGTCATGACG 3120

CCATCTGATA AAAAGCGTAT GATGGATCGT AAAAAGCATT CAAAAGTCAA TGTGCCTGAA 3180

TTAAAGCCTG TACAAAGTAA GCAAGCTGTG AGTGAAAGAA TGCCTGCGAG TCAAGCCACA 3240

CCATCATCAA GATCTGATTC ACAAGAGTCA AATACAAATG CATATAAAAC AAATAATATG 3300

ACATCAAACA ATGTTGaGAA CAATCAACTT ATTGGTCATG CAGAAACAGA AAATGATTAT 3360

CAAAATGCAC AACAATATTC AGAGCAGAAA CCTTCTGTTG aTTCAACTCA AACGGAAATA 3420

TTTGAAGAAA GTCAAGATGA TAATCAATTG GAAAATGAGC AAGTTGATCA ATCAACTTCG 3480

TCTTCAGTTT CAGAAGTAAG CGACATAACT GAAGAAAGCG AAGAAACAAC ACATCCAAAC 3540

AATACTAGTG GACAACAAGA TAATGATGAT CAACAAAAAG ATTTACAGTC ATCATTTTCA 3600

AATAAAAATG AAGATACAGC TAATGAAAAT AGACCTCGGA CGAACCAACA AGATGTTGCA 3660

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CCAAGTGTTT CATTACTAGA AGAACCACAA GTTATTGAGT CGGACGAGGA CTGGATTACA 378 0 

GATAAAAAGA AAGAACTGAA TGACGCATTA TTTTACTTTA ATGTACCTGC AGAAGTACAA 3 84 0 

GATGTAACTG AAGGTCCAAG TGTTACAAGA TTTGAATTAT CAGTTGAAAA AGGTGTTAAA 3900

GTTTCAAGAA TTACGGCATT ACAAGATGAC ATTAAAATGG CATTGGCAGC GAAAGATATT 3 960 

CGTATAGAAG CGCCTATTCC AGGAACTAGT CGTGTTGGTA TTGAAGTTCC GAACCAAAAT 4 02 0 

CCAACGACAG TCAACTTACG TTCTATTATT GAATCTCCaA GTTTTAAAAA TGCTGAATCT 4080

AAATTAACAG TTGCGATGGG GTATAGAATT AATAATGAAC CATTACTTAT GGATATTGCT 4140

AAAACGCCAC ACGCACTAAT TGCAGGTGCA ACTGGATCAG GGAAATCAGT TTGTATCAAT 420 0 

AGTATTTTGA TGTCTTTACT ATATAAAAAT CATCCTGAGG AATTAAGATT ATTACTTATC 42 6 0 

GATCCAAAAA TGGTTGAATT AGCTCCTTAT AATGGTTTGC CACATTTAGT TGCACCGGTA 43 20 

ATTACAGATG TCAAAGCAGC TACACAGAGT TTAAAATGGG CCGTAGAAGA AATGGAACGA 4 3 80 

CGTTATAAGT TATTTGCACA TTACCCATGT ACGTAnTATA ACAGCATTTA ACnAAAAAGC 44 4 0 

CCCATATGAT GAAAGAATGn CAAAAATTGT CATTGTAaTT GATGAGTTGG CTGATTTAAT 4 500 

GATGATGGTC CGCAAGAAGT TG 4522

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 751 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

T CAAG-TTT AC GGATACGTAT ATATTTTGCA TGACATTTAG TGCAATAATA TTCATAATTT 60 

GCCCGTTGTT GATAGCTTTC AATGCTGTTA CAAAATCTAG GCGCTCCAAC CTGTTGGCTC 12 0 

AATCGTTTAA AATCTTGATC TTTATGTTGA TAACCTTTAC CAGCAATATG CAAGTGATAA 180

TGACACAATT CGTGCAGTAT AATTTTTACA ACAGCATCTT CTCCATAATG CTCATATTGT 24 0 

TTTGGATTAA TTTCAATATC ATGGGACTTT AAAAGATAAC GTCCGCCTGT TGTACGTAAC 3 00 

CTTTTATTAA AATATGCACA ATGTCGAAAC GTACGTCCAA ATTTTTCTTC CGAAAGATTC 360 

TCAACCATTC GCTGAAGTTT GTCATTATTC ATGTGGATCA ATCATCGTTA ATGATACTTT 420 

GTCTTTATTT TTGTCAATAC TGTAAATCCA AACGTCAACG ATATCACCAA CACTGACAAT 4 80 

ATCCATTGGA TTTTTTACGA ACTTCTTAGA AAGTTTCGAA ACATGGACAA GTCCATCTTG 54 0 
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TTTCATTCCT TCTTGTAAAT CTTCAATTGA TAGCACATCG GATTTAAGGA TTGGTGTTTC 66 0 

AAACTCGTCC CTTGGATCTC GATTAGGTGC GTTCAAGGAT TTAATAATAT CCTCTAATGT 72 0 

AGGTACACCG ACTTGTAATT CAATCGCCAG T 751 
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1076 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

TCTCCAGCTT TAACTTGATC TGGCACTTTA ACAATTGTCT GATCCATACA TACGCGACCA 60 

ATAACTTCGC ATTGATGACC ATTTACATTT ACAAAGCTAC CTTGCATTAT GCGTAAATGG 120

CCATCTGCAT ATCCAATAgG TAACAATGCT ATTGTAGTTG GGTCAGTAGC TGTATAAGTT 180 

GCACCATAAC TTACAGACTC ACCCGCTTGT AGCGTCTTTG TTTGAACTAC ATTAGCAATT 24 0 

AATTGCACAC TTGGTTTAAG GTGTACTTTA ACTTTTTGCT GTACATACTC TGATGGATAA 300

TATCCATAAA GGGAAATTCC TGGTCTTATT GCATTACAGA ATTGGCAATC CATTAATAGA 3 60 

GAGCCTGCTG AGTTCTGACA ATGTATATAT TCAGGTTTAA TTGCTTCATT GACCATATCT 420

TTAAAACGTT GATATTGTTC AGTTGTCATA TCTCCTGGTT CGTCAGCACA GGCAAAGTGT 4 80 

GTAAACACGC CTTCAAATAC AAGTTGCTCA TATTGTTGAA TGATTTCAAT CACTTCTTGA 54 0 

TACGTTTTAG TATCTTTAAT ACCTAAACGT CCCATTCCTG TATCTAATTT AATGTGCAAC 6 00 

CATAACTTTT TCTCTTGCTC ACCAGAAATG TTTTTAATTG CTTCTTTCAA CCACTGTTTA 6 60 

GACGGAACCG TTAAGGCAAC TCGGTGTTGT ATCGCTTTAT CAATATCTTT AGCTGGTAAC 72 0 

ACACCTAAGA CTAAAATTTT AGCAGTAATC CCATGCATTC TAAGTTCTAT CGCTTCATCT 780 

AACGTTGCTA CAGCAAAAAA TGTGGCGCCA TTTTCCATTA AATGACGTGC TACTTTAACA 84 0 

CTACCTAGTC CATAGGCATT GGCTTTAACG ACAGCCATCA CTGTTTTATT TGGATGCAAT 900 

GTACTGAATA CTTTGAAATT TGATGCAACA GCGTTTAAAT CTACATTCAT ATACGCAGAT 960 

CTATAATATT TATCCGACAT ATTACTTCCT CCTGTAATTC CCACACGTTT TAAAACTAGA 1020 

TCTTAATTAT CATTGTATAA CAAATTTAAA ATGCTGACTT TTCTAAAACA ACTTGG 1076 

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2930 base pairs 
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(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

TGACCACAAT GCCCAATACA ACCATCCCAT GGTAAAGCCA AGAGATGAGT CAATAAAGCG 6 0 

TGTTGAATAA GAGCTGAATG AACCTGATAC TGGATAAAAT GTTGCCAACT CTCCAATTGA 120 

TGACATTAAG AAATATAGCA TGACACCAAT AACAAGATAA GCGAGTATAG CGCCVCCAGG 180 

ACCAGCTTGA GAAATGATAT TACCAGTAGC TACAAATAGA CCAGTCCCAA TTGCACCACC 240

TATAGCAATC ATGGAAATGT GTCTTGAGTT AAGACTACGG TTCATTTTAT TATCTTCCAT 300 

ATTTAGTCTC CCATCTATTT AAATATACCC ATTATTGTAA GCTTTTTAAG TGTACTATTC 360 

AATAACTATT TAGTACTGTA AAGCGAAAAA ATTAAAATTT TCTGATTTTT TAATCATCTT 420 

GAGCATGTTT AATTGTAATT TTGATGGGGT TAAATTATAA TATGTATTAA ATTATAATTA 480

TnATAAATTG TGGAGGGaTG ACTATGTCAC AACAAGACAA AAAGTTAACT GGTGTTTTTG 540 

GGCATCCAGT ATCAGACCGA GAAAATAGTA TGACAGCAGG GCCTAGGGGA CCTCTTTTAA 600 

TGCAAGATAT TTACTTTTTA GAGCAAATGT CTCAATTTGA TAGAGAAGTA ATACCAGAAC 660

GTCGAATGCA TGCCAAAGGT TCTGGTGCAT TTGGGACATT TACTGTAACT AAAGATATAA 720 

CAAAATATAC GAATGCTAAA AtATTCTCTG AAATAGGTAA GCAAACCGAA ATGTTTGCCC 78 0 

GTTTCTCTAC TGTAGCAGGA GAACGTGGTG CTGCTGATGC GGAcGTGACA TTCGAGGATT 840

TGCGTTAAAG TTCTACACTG AAGAAGGGAA CTGGGaTTTA GTAGGGAATA ACACACCaGT 900 

ATTCTTCTTT AGAGATCCAA AGTTATTTGT TAGTTTAAAT CGTGCGGTGA AACGAGATCC 960 

TAGAACAAAT ATGAGAGATG CACAAAATAA CTGGGATTTC TGGaCGGGTt TCCAGAAGCA 102 0 

TTGCACCAAG TAACGATCTT AATGTCAGAT AGAGGGATTC CTAAAGATTT ACGTCATATG 1080

CATGGGTTCG GTTCTCACAC ATACTCTATG TATAATGATT CTGGTGAACG TGTTTGGGTT 114 0 

AAATTCCATT TTAGAACGCA ACAAGGTATT GAAAACTTAA CTGATGAAGA AGCTGCTGAA 12 00 

ATTATAGCTA CAGATCGTGA TTCATCTCAA CGCGATTTAT TCGAAGCCAT TGAAAAAGGT 126 0 

GATTATCCAA AATGGACAAT GTATATTCAA GTAATGACTG AGGAACAAGC TAAAAACCAT 132 0 

AAAGATAATC CATTTGATTT AACAAAAGTA TGGTATCACG ATGAGTATCC TCTAATTGAA 13 8 0 

GTTGGAGAGT TTGAATTAAA TAGAAATCCA GATAATTACT TTATGGATGT TGAACAAGCT 144 0 

GCGTTTGCAC CAACTAATAT TATTCCAGGA TTAGATTTTT CTCCAGACAA AATGCTGCAA 1500

GGGCGTTTAT TCTCATATGG CGATGCGCAA AGATATCGAT TAGGAGTTAA TCATTGGCAG 156 0 
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GGTCAAATGC GCGTAGTTGA CAATAACCAA GGTGGAGGAA CACATTATTA TCCAAATAAC 16 80 

CATGGTAAAT TTGATTCTCA ACCTGAATAT AAAAAGCCAC CATTCCCAAC TGATGGATAC 174 0 

GGCTATGAAT ATAATCAACG TCAAGATGAT GATAATTATT TTGAACAACC AGGTAAATTG 1800

TTTAGATTAC AATCAGAGGA CGCTAAAGAA AGAATTTTTA CAAATACAGC AAATGCAATG 186 0 

GAAGGCGTAA CGGATGATGT TAAACGACGT CATATTCGTC ATTGTTACAA AGCTGACCCA 1920 

GAATATGGTA AAGGTGTTGC AAAAGCATTA GGTATTGATA TAAATTCTAT TGATCTTGAA 1980

ACTGAAAATG ATGAAACATA CGAAAACTTT GAAAAATAAA TTTGATATGT AGTTTCTATA 2 04 0 

TTGCGTAGTT GAGCAGTTTA TGATATCATA ATAAATCGTA AAGATTCCTA ACAAGAGAGG 2100 

GTGTTTAACG TGCGCGTAAA CGTAACATTA GCATGCACAG AATGTGGCGA TCGTAACTAT 216 0 

ATCACTACTA AAAATAAACG TAATAATCCT GAGCGTATTG AAATGAAAAA ATATTGCCCA 222 0 

AGATTAAACA AATATACGTT ACATCGTGAA ACTAAGTAAT TCTTATCATT CAAATACGAC 2 28 0 

GATTTGAAAA TAAAGCGGGC IT AC CT ATT A TATTGGGGAG CTCGCTTTTT TATGAAATTT 234 0 

TTGTGAAGAG TGATTAATGG ATTGAGTTTC ATCGGTAGAA CAATATATGA TTATATTAGT 24 00 

TGTTACTTTA TTAAAaTTTG AGAATATTTA TAGAAGGAAA TAGATTACTG ATTTTATAAA 24 6 0 

GTCACTTTGT TAGCGAATGC TTGAAAGAGT ATTTAATATA GTAGAATTTA AAATTTCAAA 2 520 

GCGGAATTTA ATAAGTACGA AGTAGTTCTG GGTATGTTTT ATAAATGTTC GATAATACAC 258 0 

TTTAATCTTA AATATGATGG TTTAGAAAAT GATTTAACAA AGAAATGAaA CTTTACTGTT 2640

GAATTATGTG AGGATTGTGT TATTATATAA ATCGTAATAA TTACGATTTG ATAAAAAGTG 2 7 00 

AGGTAACTAT ATATGGCTAA GAAATCTAAA ATAGCAAAAG AGAGAAAAAG AGAAGAGTTA 2760

GTAAATAAAT ATTACGAATT ACGTAAAGAG TTAAAAGCAA AAGGTGATTA CGAAGCGTTA 2820

AGAAAATTAC CAAGAGATTC ATCACCTACA CGTTTAACTA GAAGATGTAA AGTAACTGGA 2880

AGACCTAGAG GTGTATTACG TAAATTTGAA ATGTCTCGTA TTGCGTTTAG 2930

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3606 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

CTTCTTGCCA TGGCTCTCTT TATTTAAAAA TGCTTCCAAC TTGTCCATTT GATTGTTTCT 6 0 



EP0 786 519 A2 

TTATAAAAAA CTAATTTTAC AAATGCTTTT GCGTTCTTAC AAAAAATGCA TTTGACTATT 18 0 

ATTATAATAA GCGTATAATT GTCGCATATT ATTTTTTGTA TTTTTGGCAA TAACGAAGGA 24 0 

GTATTTATGA ATAAAGACAA GCAATTGCAC AACGACAAAA TCAATCTATC CCAATTAGTC 3 00 

TTATTAGGGT TAGGCTCTTT AATAGGATCT GGTTGGCTAT TTGGTGCGTG GGAAGCATCA 3 60 

TCAATAGCTG CACCAGCAGC AATCATATCA TGGGTTCTTG GATTCCTAGT CATTGGAACC 420

ATTGCCTATA ACTACATTGA AATCGGCACA ATGTTTCCTC AATCAGGTGG CATGAGTAAC 4 80 

TATGCCCAGT ATACACATGG CTCATTATTA GGCTTTATTG CTGCTTGGGC GAATTGGGTG 54 0 

TCTTTGGTGA CAATAATACC TATCGAAGCT GTGTCAGCTG TTCAATATAT GAGTTCTTGG 6 00 

CCGTGGCATT GGGCGAAACC AATGAGATAT TTAATGGAAA ATGGCTCTAT TAGCACATAC 660

GGATTGCTAG CTGTATATCT CATCATTGTT ATTTTTTCAT TATTAAACTA TTGGTCCGTA 72 0 

AAACTTTTAA CATCATTTAC GAGTTTAATT TCTGTATTTA AATTAGGCGT ACCCATGTTA 780 
ACCATCATCA TCttTGATGCT ATCAGGATTC GACACTTCAA ATTACGGCCA TTCGGCAAGC 84 0 

ACATTTATGC CTTACGGAAG TGCACCGATT TTTGCTGCAA CAACAGCATC AGGGATTATT 900 
TTTTCATTCA ATTCATTCCA GACAATTATT AATATGGGTT CAGAAATTAA AAATCCTGAA 960

AAAAATATCG CAAGAGGCAT CGCTATCTCA CTGTCAATCA GTGCAGTGTT GTACATCATT 1020 

TTACAAAGTA CGTTTATCAC TTCTATGCCT CAATCAATGT TACAACATAG TGGATGGAAT 1080 

GGCATCAACT TCAATTCACC ATTTGCTGAT TTAGCTATCT TATTAGGAAT TAATTGGCTC 114 0 

GCAATTTTAC TATACATTGA AGCTTTTGTA TCACCATTCG GTACTGGCGT GTCATTTGTC 12 0 0 

GCCGTTACAG GTCGAGTTTT ACGAGCAATG GAGAAAAATG GACATATCCC TAAATTTCTT 126 0 

GGGAAGATGA ATGAAAAATA TCATATCCCA CGTGTAGCAA TCATCTTTAA TGCCATCATT 1320

AGTJCTGATTA TGGTTACATT ATTTAGAGAT TGGGGTACGC TAGCAGCAGT TATTTCTACT 13 8 0 

GCAACTTTAG TAGCCTATTT AACTGGCCCA ACGACAGTGA TTGCATTAAG AAAAATGGGA 144 0 

CCAACAATGA CTCGTCCATT TAGAGCAAAA ATTTTAAAAG TAATGGCACC ATTATCATTT 150 0 

GTATTAGCTT CATTAGCTAT ATATTGGGCA ATGTGGCCAA CAACGGCTGA AGTTATTTTA 1560

ATCATTATAC TTGGATTACC AATCTACTTC TTCTATGAAT ATCGTATGAA TTGGCGTAAT 162 0 

ACAAAGAAAC AAATTGGTGG TAGCTTATGG ATTATTGTAT ATTTAATCGT GCTATCAATA 1680

CTGTCATTTA TAGGAAGCAA AGAATTTAAA GGCTTAAATA TGATTCACTA TCCATTTGAC 174 0 

TTTATCGTTA TTATTATTGT GGCACTTATC TTCTATTACA TCGGTACAAC GAGTTCATTT 18 0 0 

GAAAGCGTCT ATTTCCGTCG CGCAACACGA ATCAATACGA AGATGCGTGA GTCACTAAAT 186 0 
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CACACACATT AACCAACCAT TGATTTCAAC ATCTTGGTTG GTTTTTTATT TTGAAAATCG 198 0 

GTTATAAATA ACTAACATAA CAAGATGATG ATCAGGCTGG GACATAAATC AATGTTCTAT 204 0 

GCTCTACGAA gTTATATTGG CAGTAGTTGA CTGAACGAAA ATGCGCTTGT AACAAGCTTT 2100 

TTTCGATTCT AGTCAGGGGC CCCAACACAG AGAATTTCGA AAAGAAATTC TACAGGCAAT 216 0 

GCAAGTTGGG GTGGGACGAC GATAAAGAAA TACTTTTTCT ATAGAAATTA GTATyt CTTA 222 0 

TGCATGAGTT TTACTCATGT ATTCATATTT TTAAGTACAC ATTAGCTGTG GCTAATGTAT 228 0 

AAGAACCACT ACATAATAAA TCATTTGTGG CTCTTTATCA TTTCTGTCCC ACTCCCGTAG 2340

AAGTACATCA TATAATGCTG AAAATGGTTT GAGTTAAAAC AGATATCAAG CTCGTCTGAT 24 00 

TCAGTCACAA AATTGTCTTG TTATACTTGT CACCTATCAT CTATAGACCG TGGTATGATT 24 6 0 

AAATTGGGGA TGATAAAGGA GGTTAATAAA TATGAAGATT AATACTACAG GTGGTCAAAT 2 52 0 

TCATGGTATT ACACAAGATG GTTTAGATAT CTTCTTAGGC ATTCCTTATG CAGAACCACC 2 580 

AGTTCATGAC AATCGCTTTA AACATTCTAC GTTAAAAACA CAATGGTCAG AGCCAATTGA 2640

TGCAACTGAA ATACAACCCA TCCCACCGCA ACCAGACAAC AAATTAGAAG ATTTTTTCTC 2 7 00 

CTCACAATCT ACAACTTTTA CTGAACATGA AGACTGTTTA TATCTAAATA TTTGGAAACA 2 760 

ACATAATGAT CAGACGAAGA AACCTGTCAT CATTTATTTT TATGGTGGTA GTTTTGAAAA 2 82 0 

TGGTCATGGT ACAGCCGAAC TCTATCAACC GGCACATTTA GTACAAAATA ACGACATTAT 2 8 80 

CGTTATTACA TGCAATTATC GTTTAGGCGC ATTAGGATAT TTAGACTGGT CATATTTTAA 2 94 0 

TAAAGATTTT CATTCCAATA ATGGCCTTTC AGATCAAATC AATGTCATAA AATGGGTGCA 30 00 

TCAATTTATT GAATCCTTCG GTGGCGACGC TAATAACATT ACTTTAATGG GTCAGTCTGC 30 6 0 

AGGCAGTATG AGCATTTTGA CTTTACTTAA AATACCTGAC ATTGAGCCAT ACTTCCATAA 312 0 

AGTQGTTCTA CTAAGTGGCG CACTACGATT AGACACCCTT GAGAGTGCAC GCAATAAAGC 318 0 

ACAACATTTC CAAAAAATGA TGCTCGATTA TTTAGATACA GATGATGTTA CATCATTATC 3 24 0 

GACAAATGAT ATTCTTATGC TGATGGCGAA gcTAAAACAA TCTCGAGGAC CTTCTAAAGG 3300

GCTTGATTTA ATATATGCGC CTATTAAAAC AGATTATATA CAAAATAATT ATCCAACAAC 336 0 

GAAACCAATT TTTGCATGTT ATACAAAAGA TGAAGGCGAT ATTTATATTA CTAGTGAACA 342 0 

GAAAAAATTA TCGCCGCAAC GCTTTATCGA CATTATGGAA TTAAATGATA TTCCTTTAAA 3480

ATACGAAGAT GTTCAGACGG CGAAGcAACA ATCTTTAGCG ATTACACATT GTTATTTCaA 3 54 0 

ACAGCCGATG aAGCAATTTT TACmACmACT CAATATACmA GATTCCAACC GCACCAACTA 3 600 

TGGCTT 3606
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15109 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

GAAATTAAAA AAGCAATTGG nACAAGATGC AACAGTGTCA TTGTTTGATG AATTTGATAA 60

AAAATTATAC ACTTACGGCG ATAACTGGGG TCGTGGTGGA GAAGTATTAT ATCAAGCATT 120

TGGTTTGAAA ATGCAACsAG AACAACAAAA GTTAACTGCA AAAGCAGGTT GGGCTGAAGT 180

GAAACAAGAA GAAATTGAAA AATATGCTGG TGATTACATT GTGAGTACAA GTGAAGGTAA 240

ACCTACACCA GGATACGAAT CAACAAACAT GTGGaAGAAT TTGAAAGCTA CTAAAGAAGG 300

ACATATTGTT AAAGTTGATG CTGGTACATA CTGGTACAAC GATCCTTATA CATTAGATTT 360

CATGCGTAAA GATTTAAAAG AmAAATTAAT TAAAGCTGCA AAATAATT^A fiCTATATAAG 420

TTAGTGAAAT GAGAGTCTGA AACATATCAA TCTTTTGATA TTGTATTAGG CTCTTATTTT 480

TATAGCTAGA AAGTTAGATA TTTGTATTTT TTTAAATAAT AAGTGCCGTT GTTATCGTTC 540

AATTTAATTA ATGATAGATT AGTATTATTA TAGCTAAAGT AGTATACCTG AGAAAATAGC 600

TCAATGTATC TCTTTATTAA TAAGTTATAT CATAATTATT TTAGTGCATA CTTTATGGAA 660

GGGATATCAG GGAATGGCTT TCAATTAAAG AAGAGGTTTA AAAGGATTAC AACAGAATGT 720

TATGATTTTG TAGAAAGATA TATAACAACG TTTTATAAAA ACATAATATT GTTAATGGAA 780

AATGAAATGT AAGGGGGATT TCGAGTGACT AAGAAAGTTT ATTTTAACCA CGATGGTGGT 840

GTAGATGATT TAGTATCTCT ATTTTTATTA TTACAAATGG AAAACGTTCA ATTGATAGGG 900

GTCAGTACAA TTGGTGCTGA TTGTTATTTA GAGCCATCTT TGAGCGCATC AGTAAAAATT 960

ATTAATCGTT TTTCAAATGA AGATATTCAA GTTGCGCCAT CATATGAACG AGGAAAAAAT 1020

CCATTTCCTA AAGAATGGCG TATGCATGCC TTTTTTATGG ACGCATTGCC AATTTTAAAT 1080

GAGCCAGTCA AACATGTTGC TTCAAATGTG AGCGACAAAG AAGCCTTTGA AGACATTATT 1140

CAAACTTTAA AGAGACAATC AGAAAAAGTA ACATTATTAT TTACAGGCCC GCTTACAGAT 1200

TTAGCAAAAG CACTACAAAA AGATTCATCT ATCGTTCAGT ATATAGAAAA ATTAGTTTGG 1260

ATGGGTGGCA CCTTTTTACC AAAAGGAAAT GTTGAAGAAC CTGAGCATGA TGGTTCTGCA 1320

GAATGGAATG CATATTGGGA TCCAGAAGCG GTTAAAATTG TTTTTGATAG CGATATAGAG 1380

ATTGATATGG TTGCTTTAGA AAGTACGAAT CAAGTACCGC TAACGTTAGA TGTTAGACAA 1440



GTACCACCAT TAACACACTT TATAACAAAT 
ACTGCTTATA TTGGTAACAA GGACTTGGTT 

5 AGTTATGGAC CAAGTCAAGG TAAGACATTT 

ATAAATCATG TAGATAACAA CGCATTTTTT 
AATTAACAGC TGTGTAGAAT AATTAAGGTT 

10 TTTTCATTTC TTAAAGTTTA CAATGGTGCT 

TAAAAAATGA CAACAAAACA GTTAGTATAT 
TTAGGATTGG TACCGGTAAT TCCACTACCA 

75 

ATTGGTATTT TCTTAGCAGG TGCGATTTTA 
GTCTTTTTAT TATTAGTAGT TGCTGGCTTG 
GGTGTATTCG CAGGTCCTTC AGCAGGGTTT 

20 

ATTGGGGCGA TTCGAGATAG ATTCATCAAT 
ATTTTAGTTT TTGGTGTTAT AGCATTAGAT 
ATTAACATAC CATTTACGAA AGCTATTTCA 

25 

TTAAAAGCAA TTGTAGCAAG TTTGATTGGT 
CAAATTATGG GAATAAAATA ATCATATTTA 
GAAATTTATA AAAGTGAAAG GAGTAGGTGT 

30 

ATTGTAACGG CACTATATTT GAAAATGACG 
TTCCACAACG ATACACAAGT AACACATGGA 
GG CTGAAAGA CTTAGAACGT CAACATCAAT 

35 

CCTTTAGTTT CCCGGAAAAT GAACAACTTG 
TGAATTTTGA ACTAGGTATT ATGGAATTGT 
TGCCGCGTAA CTCTGACGTT GAAATTGCCA 

40 

TAAAAGTTGC ATATCAGTTT AGTTTGCCAT 
AAATGGTAAG GGAACATTAT CAAAAAGATG 

45 ATGAACCTAT TGGCGTTGTA GATGTCATTG 

TTGGTGTATT AGAACAATTT CGGCACCAAG 
GTGAATACGC CATATCAAAA AATCACAAAC 

50 CAGCAAAAGA TATGTATGCA AAGCAAGGTT 
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TCTACTTACT TTTTATGGGA TGTTTTAACG 1560 

CATTCAATTG AGAAAAAAGT CGATGTAATA 16 20 

GAGTGTAAAG ATGGGCGCAA AATTAATGTC 16 80 

GATTATATAA CTGCACTTGC TAAAAAAGTA 174 0 

TTAATTTATA TAGAACAACT TATTGTAAAC 18 0 0 

ATAATAATGG TCATGAAATA CGAAAGGAAG 1860 

ACAGCTTTAA TGACAGCGAT TATCGCTATT 1920 

TTTTCTTCAG TACCAATTGT ACTTCAAAAC 19 80 

GGACGTAAAT ATGGCACATT AAGTGTTATC 204 0 

CCATTGTTAT CAGGTGGTCG CGGTGGCATC 210 0 

TTACTATTAT ATCCAGTTGT AGCATTCATG 216 0 

GAAATTAATT TCTGGATTTT ATTCGTTGGT 2 220 

GTTATTGGTA CATTGATTAT GGGCATGATT 2280 

ATTTCATTAG CTTATTTGCC TGGTGATATA 234 0 

ACAGCTTTAC TTAATCACTC GCAGTTTCGT 24 00 

AGATAGTAAA GTAATTGAAT AAGTTGCTTT 24 60 

CAATGGCTAG TATAAGTATG TCAGATATAT 2 52 0 

ACGAGCAGTT GATTTATTTA ACGCCTTCTT 25 8 0 

TATATAAAAA GACGCCTACC CAAGAGCGAT 264 0 

TACATACAAA TCAAGGTTCA AATCATTATG 2700 

ATAATCATTG GATGGCTATG TTTAAAGATA 27 60 

ATGCCATAGA AAGTGATGCG CTTGCCAATT 2 820 

TCGTTGACGA GTCGCATATA GATGCCTATT 28 8 0 

TTGGAAAAGA CTATGCAGAT GCACATGAAG 294 0 

TGATTAAACG CTTAGTAGCT TATTTAAATA 3 000 

AAAGTGAAAA TTACATTGAA TTAGATGGAT 3 06 0 

GAATTGGATC TACAATTCAA TCGTTGATAG 312 0 

CAATCATATT AGTTG CAGAT GGTGAAGATA 318 0 

ATGTCTATCA ATCGTTTTGT TATCAAATAT 3240 
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TAAGCTGGTT TCGAGTAGAA ATCAACTTAC TGCTTTTTAA ATTGTTTTGA GCTACTTATA 3360 

CTTATAAAAA TAGTGCGTTT AAATTGTTGA TTCATGTAGA ATATCGTTCA TTATGACACA 34 20 

CTATAATGAA TATGTTATTG TTCAGAATCA ATGATACGTT CTGGATGACT GTATATATTA 34 8 0 

AAGCCACCAT TTCGAATAAA TCCAACTGCC GTAATATTTA GGTCATTAGC TAAGGTTACA 3 54 0 

GCAAGCGTTG TCGGAGCTGA TTTAGATAAA ATGACGCCAA CACCAATTTT TGCGGCTTTA 36 00 

ATTAAAATTT CTGATGAAAT ACGTCCACTA AAAATTAATA CTTTATCTCG GACAGTAATA 36 6 0 

TGTCGCTGAA TACAAAATCC ATATAATTTA TCTAGAGCGT TATGTCTACC AATGTCTTGT 3 720 

CGATGTACAA AAAATGTCAA ACCATCGCTT ATAGCAGCAT TATGTAAGCC ACCTGTTTCT 378 0 

TGGTAAATAT GACTTGCACT TTGTAATCGA GTCATCATGT TAATAATTTG CATTGGAGTT 3840

AAAGTGATTT TAGACATAGA TGTTTTAGCG ATAGCAGCAT CATTTTGAAA ATAAAACTCA 3 900 

CGACTCTTTC CGCAACAAGA TGCAATCATT CGTTTTGTGG AATATTGAAA GCGATCGCCT 3 960 

AAATCTTTAT TAAGTTCAAC ATGGGCAAAA CCTTTACTAT CATCAATCAG TACAGATTTT 4 020 

AATTCATCTC GCTTTAAAAT GGCACCTTCC GAAGCCAGAA ATCCAATGAC TAACTCCTCA 4 080 

AGGTTTGTTG GACTGCATAT AACAGTCGCA AATTCTTCAC CATTCACCAT AATTGTAAGT 414 0 

GGAAATTCTG TCACATATTG ATCTGTTGTA TTGAATAATT TTCCATCTTC ATATCTAACA 4200 

ATTGGTTGAC CTAAAGATAC ATCTTTGTTC ATTATCTAAC CCCTTTAATT AGCTTAAACT 4 260 

TTATTTTAAA GCAATTTGCT TAAAATTTTA ACATATTTGC TTAAGTTTGA AATTTGATTG 4 3 20 

ATAAAAATTA ATAGCGAGCA ATCTGTTTGA TTTAAATTGA ATTCGAGAAT ATACATACTA 4 3 80 

GGGCATCAAT TAATAAATAT CAATCTTATG CAAATTTGAC AATTGTTTGA ATCAATATAT 444 0 

AAACAGGCAA CGGTTCTTTT CAAATATAAT AGTAAGTGTA TAATGAAAAT GTAAATATTA 4 50 0 

TTAAAAATGG GGGTTCACTC AATGAAATTG AAACGTTTAT TTGCTGTTGT GATTGCAATG 4560 

CTTTTAGTAT TAGCTGGTTG CTCTAATTCT AACGATAATA ATGAAAGTAA AAAAGATGAC 4 62 0 

GCAGACAATG GTAAGAAACA AGAGATTCAA GTTGCAGCGG CAGCAAGTTT AACAGATGTA 468 0 

ACCAAGAAAT TAGCTTCAGA ATTTAAAAAA GAGCATAAAA ATGCTGATAT TAAATTTAAC 4 74 0 

TATGGTGGAT CAGGGGCATT AAGAAAACAA ATTGAATCAG GCGCACCTGT TGACGTATTT 4 80 0 

ATGTCTGCAA ATACTAAAGA TGTAGATGCA TTAAAAGACA AGAATAAAGC GCATGATACA 4 860 

TATAAATATG CGAAAAATAG TCTAGTATTA ATTGGTGATA AAGATTCAAA TTACACTTCA 4 92 0 

GTAAAAGACT TAAAAGACAA TGATAAATTA GCATTAGGTG AAGTGAAAAC TGTACCAGCA 4 98 0 

-T^sr.^-Vi^ ^ AAA CWGTA TTTAOATAAC AATAACTTAT TTAAAGAAGT CGAAAGTAAA 5040 
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CAAGGTTTTG 


TGTATAAAAC 


TGACTTATAT 
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AAACAAAATA 


AAAAAATTGA 




516 0 




GTAATTAAAG 


AAGTAGAACT 


TAAGAAGCCA 


ATCACATACG AAGCTGGTGC 


TAPATPAfJAT 


£. U 


5 


AGTAAATTAG 


CAAAAGAGTG 


GATGGAATTP 


TTAAAATCAG 


ATAAAGCTAA 
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AAAGAATACC 


A CTTTG CAG C 
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GTAATCCATG 
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GATATCAATA 


CGAGTTGCTG 


TAATCAGTAC 


GATTATTGTA 


ACGGTTTTAG 
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10 


ATCTAAATGG 


TTGTATCGTC 


GTAAGGGTTC 


GTGGGTTAAA 


GTATTGGAAA 


nTTT A TT^J A T 

\J X X x n X X X 


O u 




ATTACCTATT 


GTTTTGCCGC 


CAACGGTATT 


AGGTTTTATT 


CTATTAATCA 


1 L, 1 1 L 1 Lull 




15 


AAGAGGACCA 


ATCGGTCAAT 


T CTTTG CGAA 

X X X X Kjy—yJf^f^ 


TGTACTACAT 


TTACCTGTAG 


1 Vj x 1 LAV. 1X1 


ccon 


GACAGGTGCT 


GTGATAGCAT 


CTGTCATTGT 

X >J X X X >kJ X 


TAGTTTTCCA 


CTAATGTATC 


rtnLft 1 /\L 1 vj 1 






GCAAGGCTTC 


AGAGGTATAG 


ACACGAAAAT 


GATTAATACA 


GCTAGAACGA 






20 


TGAAACGAAA 


ATTTT P {~*T C A 


/V\X XXV\X XXX 
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AAACGCTCTA 
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TATAATGATG 
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AGTTTTGCTC* 
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TGAGTTTGGT 


GCTACATTAA 
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ATATATTCCA 


AATAAAACGA 

4*4*. X iAX^iUx\d\JlX 


ATACAPTACC 


TTTAGAAATA 


TACTTCTTAG 


i uunnLiirtLTO 


300U 


25 


TAGAGAAAAT 


GAAGCGTGGT 


TATGGGTATT 


AGTGCTAGTC 


GCATTCTCTA 


1 IoIvjVjI 1/\1 


d y fk u 




ATCTACAATT 


AATTT ATTG A 

nn x x x *i x x vjn 


ATAAAGATAA 


ATATAAGGAG 


GTCGACTAGA 




D U UU 




CAATGTGAAA 


TAT CAATT AA 


AGAAPACTTT 


AATTCGCATC 


AATATAGATG 


>\ 1 IbAnLL 


c r\ c n 


30 


AAAAATTTAT 


GCAGTTCGTG 


GTCCATCTGG 


CATTGGTAAA 


ACTACTGTTT 


TAAATATf3AT 






TGCCGGATTA 


CGTAAAGCAG 


ATGAAGCTAT 


TATCGAAGTG 


AATGGGCAAT 


X X X nL 1 


r 1 on 




TACGGCAAAA 


AACGTGAATG 


TTAAAATTCA 


ACAACGACGT 


ATTGGATATC 


TGTTTCAAGA 




35 


CTACCAATTG 


TTTCCTAATA 


TGACGGTCTA 


TAAAAATATT 


ACTTTTATGG 


CTGAACCATC 


OJUU 




TGAACACATC GATCAATTAA TTCAAACTTT AAACATTGAT CATTTGATGA AACAATATCC 


0 J D U 


40 


TATGACATTG 


TCAGGTGGAG 


AGGCACAACG 


TGTAGCACTT 


GCACGTGCAC 


TTAGCACrAA 


D *± ^ U 


ACCAGATTTA 


ATTTTATTAG 


ATGAACCTTT 


TTCTAGTTTG 


GATGATACTA 


CAAAAGATGA 


DIOU 




GAGTATTACA 


TTAGTTAAAC 


GTATTTTCAA 


CGAATGGCAA 


ATACCAATCA 


TATTTGTGAC 


CC40 


45 


ACATTCAAAC 


TATGAAGCAG 


AACAAATGGC 


TCATGAAATT 


ATTACAATTG 


GGTAATCATT 






TATTTGCCAT 


TAAAGAGTTT 


AGAACGTATT 


TAAAATTGTA 


GAAGTGAATG 


CTTCTATCAG 


^ n 

o D D U 




CATTTTAATG 


ATGTTTTAAA 


CTCTTTTTTA 


GGGG CAGTTT 


TTTTGAGAGA 


CATTGACGCG 


6720 


50 


CGTCATATAA 


TGAAAGTAAT 


GATAAAAAGA 


AAGGATAACT 


TAATGTGAGT 


CAAGAACGTT 


6780 




ATTCAAGGCA 


AATTTTATTT 


AAACAAATAG 


GTGAAATAGG 


TCAAAGCAAA 


ATAAATCAAA 


6840 
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GAGCAGGCAT TGCCAAACTA ATCATTGTTG ATAGAGATTA TATTGAATTT AGTAATTTAC 6 960 

AAAGACAAAC ATTGTTTACT GAAGAAGATG CTTTGAAAAT GATGCCTAAG GTGGTTGCAG 7020 

CTAAAAAGCA TTTGCTAGCG TTACGTAGTG ATGTTGATAT TGATGATTAT ATTGCCCATG 7080

TGGATTATTA TTTTTTGGAA ACACATGGAC AGGACGTTGA CGTTATTATT GATGCAACCG 7140

ATAACTTTGA AACACGACAA CTGATTAATG ATTTTGCATA TAAATATCGT ATACCrTXiGA 72 0 0 

TTTATGGTGG TGTTGTACAG AGTACATATA CAGAAGCTGC ATTTATACCT GGTAAAACAC 7260

CTTGCTTTAA CTGTTTGGTA CCACAATTGC CAGCATTAAA TTTAACATGT GATACAGTAG 7320

GGGTCATTCA ACCTGCCGTG ACGATGGCAA CAAGTTTACA ATTAAGAGAT GCGATGAAAG 73 80 

TATTAACGGA ACAACCAATT GACACAAAAA TAACTTATGG CGATATTTGG GAAGGTAGTC 744 0 

ATTATTCATT TGGTTTCAGT AAAATGCAAC GTTCAGACTG TACAACTTGT GGAGATGTAC 7500 

CAAGTTATCC GTATTTAAAC AAGAATGAAC AACGTTATGC AACATTGTGT GGTAGAGACA 756 0 

CTGTACAGTA TGAAAATGCA TCAATTACAC ACGACATTCT TGTTCAATTT TTAAAACAAC 7620 

ATCAGTTAAA TTATCGCAGT AATTCGTATA TGGTTATGTT TGAATTTAAA GGACACCGCA 76 80 

TTGTTGCTTT TAAAGGTGGA AGGTTTTTAA TACATGGCAT GACACGCACA TCAGATGCCA 7740

CACATCTAAT GAATTTATTG TTTGGATAAA AAAAGATAAG ACAAAAGGAG TGTAATATTA 7 8 00 

TGGGCGAACA TCAAAACGTT AAATTGAATC GTACAGTTAA AGCAGCCGTA CTAACGGTAT 7 86 0 

CAGATACTAG AGACTTTGAT ACAGATAAAG GTGGTCAATG CGTGCGCCAA CTATTACAAG 7 92 0 

CAGATGACGT TGAAGTGAGT GACGCACATT ATACAATTGT GAAAGATGAA AAAGTAGCCA 7 98 0 

TCACGACGCA GGTGAAGAAG TGGTTAGAAG AAGATATTGA TGTCATCATT ACGACTGGTG 804 0 

GAACAGGTAT TGCACAACGT GATGTGACGA TTGAAGCAGT AAAACCACTT TTAACTAAAG 8100 

AGAtfcGAAGG CTTTGGGGAA TTGTTTAGAT ATTTGAGTTA TGTTGAAGAT GTTGGCACGC 8160 

GTGCATTATT GTCTCGTGCT GTAGCAGGTA CAGTTAATAA TAAATTGATA TTTTCGATTC 822 0 

CAGGATCAAC AGGCGCAGTT AAATTAGCAT TAGAAAAGCT CATTAAACCA GAATTAAATC 8 2 80 

ATCTGATTCA TGAGCTTACA AAATAATTTA TTGATTTGAT TGGCGTTGAA AATCTCCAGA 8 34 0 

TTTACCGCCA GACTTGCTTT CAAGGTAGGT TTCGCCAATA ATCATACCTT TATCAACTGC 84 0 0 

TTTCGTCATG TCGTAAATGG TTAAAGCCGT TGCTGATGCA GCGGTTAAAG CTTCCATTTC 84 6 0 

AACACCGGTT TTGCCAGTTG TAGAGACAGT TGTTTGAATG TTTAAAGTAT AAAGGGGTGC 852 0 

ATTTGTTTCA TCCCAGCTGA AGTGAACATC TATGCCAGTC AATGGTAATG GATGGCACAT 858 0 

^GAATAAGT GT7GATGTAT TTTTGGCAGC CATAATACCA GCGATTTGAG CAGTGTTCAA 864 0 
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AATGCTTGAA TGAGCGACAG CAGTTCTTTT 
TTTGGCGTGG CCTTGTTGAT TAATATGAGT 

5 

TCTAGTATAT CATGAAAAAA TAAAAGTTTT 
GAAACCCAAT CCCAGTTAAA GAAGCAATTC 
CGGCAATTAC GGTAGCACTT GAAAAAAGTC 

10 

CTACTTATGA TATACCAAGG TTTGATAAAT 
TTGATTCACA AGGGGCAAGT GGTCAGAATC 
GTGCAGGTTC AGTTTCTGAT AAATTAGTTG 
GAGCACAAAT ACCTAATGGC GCAGATGCTG 
AAGATACATT TACAATTCGT AAACCATTTT 

20 AAGAAACAAA GACAGGCGAT GTTGTTCTAA 

TCGCGGTCCT TGCAACATAT GGCTATGCAG 
CTGTTATTGC AACAGGAAGC GAATTATTAG 

TTCGTAACTC TAATGGCCCA ATGATTCGTG

GTATTTACAA AACACAAAAA GATGATTTAG 
TGGAAAAACA TGATATCGTT ATTACAACGG 


TACCTGAGAT TTATAAGGCT GTAAAGGCGG 
CTGGTAGCGT AACAACGGTT GCATTTGTAG 
ATCCATCAGC TTGTTTTACA GGATTTGAAC 


GTGGCGCACT AGAAGTCTTC CCGCAAATAA 
AGGGAAACCC ATTCACACGA TTTATACGTG 
CTGTAGTACC TTCAGGATTC AATAAATCAG 


GTATGGTCAT GTTACCAGGA GGGTCACGTG 
TATTGACTGA ATCTGACGCT GCTGAAGAGG 

TTACAAAAAG TCTGGTAAGA CAACATTGAT

TGGTTATACA GTTGCTACTA TTAAACATCA
GGATTCAGAC GTCGATCACA TGAAGCATTT

AGGTTTTCAA TATCAGCAAA CTGTAACACG

TGAAAAATCT GTTACAATTG ACACCAATAT 



TGTAATTTGT TTGTCTGATA CATCGACCAT 8760 

AAACTCAGTC ATTTTACCCC TCCTAGTGCA 8820 

GGAGATGATT TTTAATGGTA GTAGAAAAAA 8 880 

AACGTATCGT TAATCAGCAG AGTTCAATGC 8 940 

TAAATCATAT CTTAGCAGAA GATATTGTAG 9000 

CACCTTATGA TGGTTTTGCA ATTCGCAGTG 90 6 0 

GCATTGAGTT TAAAGTGATT GATCATATTG 912 0 

GGGATCACGA AGCGGTGCGT ATTATGACTG 9180 

TTGTTATGTT TGAACAAACG ATTGAACTAG 924 0 

CAAAAAATGA AAATATATCT TTAAAAGGTG 9300 

AAAAAGGACA AGTAATTAAT CCAGGGGCTA 93 60 

AGGTTAAAGT TATTAAGCAA CCGAGTGTCG 9420 

ATGTTAATGA TGTATTAGAA GATGGGAAAA 9480

CCTTAGCAGA AAAATTAGGT CTTGAAGTTG 954 0 

ATAGTGGCAT CCAAGTCGTT AAAGAAGCTA 96 0 0 

GCGGAGTTTC TGTTGGAGAT TTTGACTATT 9660 

AAGTGTTATT TAATAAAGTA GCAATGCGTC 9720 

ATGGaAAGTA TTTGTTTGGa TTATCTGGAA 97 80 

TATTTGTGAA nCCAGCTGTT AAACATATGT 984 0 

TTAAAGCAAC ATTAATGGAA GATTTTACCA 9900 

CTAAAGCAAC GTTAACAAGT GCTGGAGCTA 9960 

GTGCGGTTGT AGCGATTGCA CATGCTAACT 10020 

GTTTTAAAGC GGGGCATACA GTAGATATTA 1008 0 

AACTTCTTTT ATGATTTTAC AAATTGTAGG 1014 0 

GAGGCATATT GTCTCTTTCT TAAAGTCACA 10200 

TGGGCATGGT AAGGAAGATA TTCAATTACA 10260 

TGAAGCGGGG GCAGATCAAA GTATTGTACA 10320 

TGTAGATAAT CAAAATCTTA CTCAAATTAT 1038 0 

CGTATTAGTT GAAGGCTTTA AAAATGCTGA 10440 
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GAATGTTTGT TATAGCATTA ATGTAAGGGA GCATGAAGAT TTTACAGCAT TTGAGCAATG 10560 

G IT ATT AAA T AAAATTAAAA ATGATTGTGA TACACAATTA ACATAGAGGA TTGAAATGAA 1062 0 

TGAAACAATT TGAAATCGTG ACAGAACCGA TACAAACAGA ACAATATCGT GAATTCACTA 106 80 

TAAATGAATA TCAAGGTGCA GTAGTTGTTT TTACCGGTCA TGTTCGCGAA TGGACTAAAG 10 740 

GCGTCAAAAC GGAATATTTA GAATATGAAG CGTATATTCC AATGGCTGAA AAGAAATTGG 10800

CACAAATTGG AGATGAAATA AATGAAAAAT GGCCTGGAAC GATAACGAGT ATTGTTCATA 10 860 

GAATAGGGCC ATTACAAATT TCAGATATCG CTGTATTAAT TGCGGTTTCT TCACCGCATC 10920 

GTAAAGATGC CTATCGAGCA AATGAATATG CAATTGAGCG TATAAAAGAA ATTGTTCCGA 1098 0 

TTTGGAAAAA AGAAATTTGG GAAGATGGTT CAAAATGGCA AGGGCATCAA AAAGGGAATT 11040 

ATGAAGAAGC AAAGAGGGAG GAATAAGAGA GATGAAGGTA CTTTACTTCG CAGAAATTAA 1110 0 

AGATATATTA CAAAAAGCAC AGGAAGATAT TGTGCTTGAA CAAGCATTGA CTGTACAACA 1116 0 



TGTAAATGAG GAATTTGTAC AAAAATCGGA TTTCATTCAA CCTAATGATA CTGTTGCATT 112 8 0 

AATTCCACCG GTTAGTGGAG GTTAAGGGAG CATGAAAGCA ATAATTCTTG CAGGTGGTCA 1134 0 

TTCAGTGCGA TTTGGTAAGC CCAAAGCTTT TGCGGAAGTG AACGGTGAGA CCTTTTATAG 114 00 

TAGAGTAATT AAGACATTAG AATCAACAAA TATGTTCAAT GAAATTATTA TTAGTACAAA 114 60 

30 TGCGCAATTG GCAACGCAAT TTAAATATCC AAATGTTGTT ATAGATGATG AGAATCATAA 11520 

TGATAAAGGT CCATTAGCAG GAATTTATAC AATCATGAAG CAACATCCTG AAGAAGAATT 11580

GTTTTTTGTC GTTTCTGTTG ATACACCAAT GATTACTGGT AAAGCTGTAA GCACGTTGTA 1164 0 

35 TCAGTTTTTA GTTTCTCATC TTATTGAAAA TCATTTAGAT GTCGCAGCTT TTAAAGAAGA 11700 

TGGACGTTTT ATTCCAACAA TTGCATTTTA TAGTCCGAAT GCATTAGGCG CTATAACTAA 11760 

AGCACTACAT TCTGATAATT ACAGTTTTAA AAATGTATAT CATGAATTAT CAACGGATTA 11820 

TTTGGATGTA AGGGATGTAG ATGCGCCCTC ATATTGGTAC AAAAATATAA ATTATCAGCA 11880 

TGATTTGGAC GCTTTAATTC AAAAATTGTA AGCTGTTAGG AGGTCCACAA ATGGTAGAAC 11940 

AAATAAAAGA TAAACTAGGA CGTCCCATCC GTGACTTACG GTTATCTGTG ACAGATCGGT 12 0 00 

GTAACTTTAG GTGTGATTAT TGCATGCCTA AAGAGGTATT TGGAGATGAT TTCGTATTTT 12 06 0 

TACCTAAAAA TGAACTTTTA ACGTTTGATG AAATGGCTAG AATCGCTAAG GTATATGCAG 1212 0 

AATTAGGTGT AAAAAAAATA CGCATTACAG GTGGAGAACC ATTGATGCGA CGGGATTTAG 12180 

ATGTACTTAT AGCTAAATTA AATCAAATCG ATGGTATTGA AGATATTGGT TTGACTACAA 12240 
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ATGTCAGTTT GGATGCTATT GATGATACGC 
AAGCGACTAC GATTTTAGAA CAAATTGATT 
5 TAAATG TTGT TATACAAAAA GGTATTAACG 

TTAAAGATAA ACATATAGAG ATTCGATTTA 
GATGGGATTT CAGTAAAGTT GTAACTAAAG 


TTGAAATCGA TCCTGTAGAA CCAAAATATT 
AGGATAATGG TGTTCAATTT GGTTTGATTA 
GTACACGCGC AAGGCTGTCA TCAGATGGGA 


ATGGATTTAA CGTTAAAGCG TTTATTCGTT 
AATTTAAAGC TTTATGGCAA ATAAGAGATG 
CAGTTGCCAA TCGTCAACGT AAAAAGATAA 


GGACCACTAC ATATTAAATC ATTAGAGATG 
ACAATATTAT TTATTAAAGT AAAAACGGTC 

2 5 GTTTTTAAAG TTTTTACAAG TTGGCGGGGC 
TACAATAATG TGCAAGTTGG CGGGGCCCCA 
GACAATGCAA GTTGGGGAAC GGGGCCCCAA 

30 TAATGTGCAA GTTGGCGGGG CCCCAACATA 
GCAAGTTGGG GATCAACGAA ATAAATTTTA 
AATCACTACA TAATAAATCT TTAGTGGTTC 


GAGTTGTAAT ATATCTTTTT TAGGTATAAA 
AGATXTAAAT CTAAACAAGA TATAGCCAGC 
AGTTTGATAT ATAATAAATT TAAGTAATTG 


AGAAACATAG GAGGCATCAT ATTATGAGTA 
GGGAGTTAAG TCAGTTAAAG CACTGGTTAA 
TTGTAGTCCT TTTTAAAGTG TATGAAGCTG 


CATTACATTT TGAAATGCTA TGGGATACAA 
ATAAAAAAGA GCTTATTTCT AAATTGCGTT 
5Q TCTATAGTAC TTCTCAAAAG AAATTGTTAG 
GCGTTACAAA CTAAAAACTT aAAAAgcaTG 



TATTTCAATC AATCAATAAT CGTAATATTA 1236 0 

ACGCGACGTC TATTGGTTTG AATGTAAAAG 12420 

ATGATCAAAT CATACCAATG CTTGAATATT 12480 

TAGAATTTAT GGATGTTGGT AATGATAATG 12 54 0 

ATGAAATGCT TACAATGATA GAGCAGCACT 1260 0 

TTGGGGAAGT AGCAAAATAT TATCGCCATA 12 660 

CAAGTGTTTC ACAATCATTT TGTTCTACAT 12 720 

AGTTTTACGG ATGTTTATTT GCAACTGTCG 1278 0 

CTGGCGTGAC CGACGAAGAA TTAAAAGAAC 1284 0 

ATCGATATTC AGATGAGAGA ACTGCTCAAA 12 900 

ACATGAATTA TATTGGTGGT TAATGTGTAG 12 960 

TTTTAATATT TCTGTCTTAC TCCCTAAAAT 13020 

ATATCTATGC CAGATTTAAT AGAAATGATC 13 0 80 

CCCAACACAG AAGCTGACAG AAAGTCAGCT 1314 0 

ACATAGAGAA TTTCAAAAAG AAATTCTACA 13200 

CACAGAAGGT GACGAAAAGT CAGCATACAA 13260 

GAGAATTTCA AAAGAAATTC TACAGACAAT 133 2 0 

TGAGAATATC ATTTCTATCC CACTCTTAAG 133 80 

TTTAACATTG ATGTCACACT CCATGCCATT 13440 

TGTTGTCGAA TAAACAACAA GTTGTCCAAA 13500 

AATTTAATAT TTGTAATAGA TAAAATGCTA 13560 

TATAATAATA TGAATTACAA ACATCTAAGA 13620 

ATAAAGTTCA ACGTTTTATA GAAGCAGAAA 136 8 0 

AAACAACACA TAAGATTTCA ATTGAAGAAT 13740 

AAAAGATTAG CGGTAAAGAA TTGAGGGATm 1380 0 

GTAAAATCGA TGTGATTATC CGTAAAaTCT 13 860 

CTGAAACGGA TGAAAGACAA GTATT CTATT 13 920 

ATAAAATTAC TAAAGAAATA GAAGTGTTAA 13980 

CCAATCTCTA TTCATCATAA TTGCGTCTTG 14040 
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GTTCATGGCA TTTCTAGTTA CATGACGTCC ATGAATTAAG AAGTAAACAA GCATAGTAAT 1416 0 

GATTGCTAAA GCGGC CAT AA AGCCGAAGAT TTCACTATAT GAAAACATAT GAGTAAATAA 14220 

CCCAAGGAAT GATGGACCGA AGCCGACACC TGCATCTAGA CCAACGTAAA AAGTAGATGT 142 80 

CGCGATACCA TATTTAATCG GGGGTGAGAC TTTTATCGCA ATAGATTGCA TTGCAGATGA 14 34 0 

TAAATTTCCA TACCCTAAAC CTAGGCAAGC ACCAGCAAGT AATATTAACC AGCTTTGATA 144 0 0 

GCTTGAAATT AAG CAT ACAA ATGAAAGGAA AAGCATGATA AATGCTGGGT AGACAATAAT 14460 

ATTTTCATTT TTATCATCCA TCAATCTACC AGCAATAGGT CTAGTAATTA ACGATGCTAT 14 520 

AGCATAGCAA ATAAAGAAAT AGCTTGCTGC AGTGACTAGG TGTCGCTCTA AAGCAAATGC 14 580 

TTGTAAATAA GTTAGGATGG ACGCATAGGT AACGCCAATT AAAAGCATAA TTACAGCAAC 14 64 0 

AGGAATGGCC TCTTTTGCAA TAAATTGATG AATACTAAAT CTTGGTTTAT CAATGACATT 14 700 

AGTTTCAGTT TTGTTATTTG TTACTTCGAA ATCAACTTTT ATAAATAATG AGATAATGAG 14 76 0 

TCCGAGT'A'l'G CCTAATATGA CACAAATAAT AAACAGTAAG TCAATTGCGT ATTTTGTAAT 14 82 0 

AAGTAACATG CCTAGAAATG GGCCAATCGC TGTACCTAAT ACTAAACTTA AGGAAAATAA 148 8 0 

ACTGATGCCT TCACTTTTTC TATTAACAGG GGTAACGTAT GCCGCAATAG TACCTGTTGC 14 940 

AGTTGTCACA ACTGCAGTTG CGATACCGTT TATGAGACGT ACAAAGATTA AAAAAGCTAA 150 00 

AGATCCATCA ATAAAATAAA GTAATTGCGT GATAATTAAA GCAATTAAAC CAATAAATAA 15060 

TAATCGTTTA GGTCCrATTT sATTTACAAA TTTACCTGTA GCAAATCGA 15109
(2) INFORMATION FOR SEQ ID NO: 45:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9072 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

GAGAGTCAAT GGCAAGAAGA ATATAAATAT TTGAGAGCGT TAATCTTTAA TGAAACAGAA 60

TTAGAGGAAG CGTATAAATG GATGCATCCT TGTTACACGT TGAATAATAA AAATGTAGTA 120

CTTATCCATG GCTTCAAAAA TTATGTTGCA CTATTATTTC ATAAAGGTGC CATTTTGGAG 180

GATAAATATC ATACACTCAT TCAACAGACT GAAAAGGTGC AAGCAGCTCG TCAGTTACGA 240

TTTGAAAATT TAACAGAGAT TCAAGCACGT ACCGAAGAAA TTAAATATTA TCTAGCCGAA 300

GCAATTAAAG CTGAAAAAGC TGGTAAAAAA GTTGAAATGA AGAAAACAGA GGAATATGTT 360
AAATTAACGC CAGGCAGACA ACATCAATAT ATATATCATA TTGGACAAGC TAAACGCAgT 480

GgAACAAGAC AAAAGCGTGT TGAAAAGTAT ATTAACCAAA TACTAGAAGG TAAAGGGATG 540

CATGATAAGT AATTAATGAG TAAAGCATAC CGGTTATACA ACAACATACA AGATGACACG 600

AAACAACCAA TGGCTCATGC TGTTGGTTGT TTTTTTAGGT GTGTCTGTCA TGGGCAACAC 660

TTTGACGTTG GAATTCCGTT ACAGGCTTGG GAGTAGAAAA TGTTAGCAAA AGGCAAGGGT 720

10 

GTCTACAATG AATGATGAAG ATATTAAAAT ATAAGGATGA CTTTGTGAGT GGCGGATGGG 78 0 

CGGTTGTCCG TCTGTAACAA TGGATGCGTG TGCATTATTA CAAAAATTCG ACTTTTGTAA 840 

TAATATTTCA CATTTTCGAC ACTTTTTTGC TATAAAACAA CCAATTGAGC GATAATAAAT 900 

15 

TCGCTTTTAA AAAATATGAG TTATCTATTT AGTTGCCAAA GATAAAATAA TAATGTTTAA 960 

TAACATCATA TAGAGTATGT TAGTTTTAAA TGTCGAATAT ACGAATGTGc AAACAAAGTA 1020 

ATCGGTAGAA ATTCAACATA CATAGCGCCG TTTACTGTTA AGTATTCACA TTACAGATGA 1080 

20 

AAAATATAAA ATTCTACATA AT CAAGACCA TGATGTGTAC TTGTTTAACT TATGACTCTA 1140 

TTTGTTTAAC AATTGCGATA ATGGTCTTTT TATTTTATGC GTATCATTCG TCATATTTTT 1200 

25 TATGAGGAAG GAGAAATGAT TATGTTAAGT ATTAAGCATT TAACGAAAAT TTATTCTGGT 1260 

AATAAAAAGG CAGTAGATGA CATCTCTTTA GATATTCAAT CTGGGGAATT TATCGCATTT 13 20 

ATTGGAACCA GTGGAAGTGG CAAAACGACT GCTTTAAGAA TGATAAACCG TATGATTGAA 13 8 0 

30 GCGACAGAAG GACAAATTGA AATTGATGGT AAAGATGTTC GGAGTATGAA TCCTGTCGAA 14 4 0 

TTGCGTAGAA ATATTGGCTA TGTTATTCAA CAAATTGGCT TAATGCCTCA TATGACGATT 1500 

AAAGAGAATA TTGTGTTGGT ACCCAAATTG TTGAAATGGA CTAAAGAGGA AAAGGATAAA 156 0 

35 CGTGCAAAGG AATTAATTAA ACTTGTGGAT TTACCGGAGT CATTTTTAGA GCGTTATCCA 162 0 

GCAGAACTAT CAGGTGGGCA ACAACAACGT ATCGGTGTTG TAAGAGCACT TGCGGCCGAA 168 0 

CAAGATATTA TTTTAATGGA TGAACCTTTT GGTGCATTGG ATCCTATTAC GAGAGATACG 174 0 

40 

TTACAAGATT TAGTTAAAAC GTTACAACGA AAATTAGGCA AGACGTTTAT CTTTGTAACA 18 00 

CATGATATGG ATGAAGCGAT TAAATTAGCA GACAAAATTT GTATTATGTC AGAAGGTAAG 18 60 

GTGGTGCAAT TTGATACGCC AGACAATATT TTAAGACATC CCGCAAATGA TTTTGTACGT 1920 

45 

GATTTTATAG GACAAAATAG ACTGATTCAA GACCGTCCCA ATGACAAGAC TGTAGAAGGT 19 80 

GTAATGATTA AACCAATCAC GATACAAGCA GAAGCAACAC TGAATGACGC CGTTCATATT 2 040 

ATGAGACAAA AACGTGTTGA TACTATTTTT GTAGTAGATA GTAATAACCA TTTACTAGGT 2100

TTCTTAGACA TTGAAGATAT AAATCAGGGT ATACGTGGAC ACAAAAGTTT ACGAGACACC 216 0 

55 
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ATTTTAAAAA GAAACGTTAG GAATGTACCT GTCGTAGATG ATCAACAGCG TTTAGTAGGA 22 80 

CTGATTACGC GTGCCAATGT TGTTGATATT GTATATGACA CGATTTGGGG CGATAGTGAG 2 34 0 

5 GATACAGTGC AAACAGAACA TGTGGGGGAA GACAcTGCGT CCTCAAAAGT GCATGAGCAA 24 00 

CACACTACTA ATGTCAAAGT ACGTGACATA GGAGATGATA AATCATGATT GAGTTCCTAC 24 60 

ATGAACATGG TGGACAGTTG ATGTCGAAAA CACTGGAACA TTTCTATATT TCTATAGTGG 2 52 0 

CATTATTACT TGCCATCATT GTTGCAGTAC CTATAGGCAT TTTATTATCA AAAACAAAGC 2580

GAACTGCCAA TATTGTATTA ACTGTGGCAG GTGTCTTACA AACTATTCCA ACACTAGCTG 264 0 

TACTTGCTAT TATGATACCG ATTTTTGGTG TTGGTAAAAC GCCTGCAATT GTAGCGCTAT 2700 

15 

TTATTTATGT ATT ATT AC CT ATTTTAAATA ACACGGTACT CGGTGTTCAA AATATTGATA 2760 

GCAACATTAA AGAAGCTGGA AAAAGTATGG GAATGACACA ATTTCAATTG ATGAAGGATG 2820 

TTGAATTGCC GTTAGCATTG CCGCTTATCA TTGGTGGCAT TCGTTTGTCA TCTGTGTATG 2880

20 

TAATTAGTTG GGCTACACTT GCAAGTTATG TAGGTGCGGG TGGATTAGGT G A rrT CATTT 2 94 0 

TCAATGGTTT AAATTTATAT GATCCACTGA TGATTGTAAC TGCAACGGTA CTCGTTACTG 3 00 0 

25 CACTAGCATT AGGTGTTGAT GCCTTATTAG CTTTAGTTGA AAAATGGGTA GTTCCCAAAG 3 06 0 

GCTTAAAAGT ATCTGGATAA TTAGGAGGCT AAGATAATGA AGAAAATTAA ATATATACTT 312 0 

GTCGTGTTTG TCTTATCGCT TACCGTATTA TCTGGATGTA GTTTGCCCGG ACTAGGTAGT 318 0 

30 AAG AG CACGA AAAATGATGT CAAAATTACA GCATTATCAA CAAGCGAATC GCAAATTATT 3240 

TCACATATGT TACGGTTGTT AATAGAGCAT GATACACACG GTAAGATAAA GCCAACATTA 33 00 

GTAAATAATT TAGGGTCAAG TACGATTCAA CATAATGCCT TAATTAATGG GGATGCTAAT 3 360 

35 ATATCAGGTG TTAGATATAA TGGCACAGAT TTAACGGGAG CTTTGAAGGA AGCACCAATT 3420 

AAAAATCCTA AGAAAGCAAT GATAGCAACA CAACAAGGAT TTAAAAAGAA ATTTGATCAA 34 80 

ACGTTTTTTG ATTCGTATGG TTTTGCGAAT ACGTATGCAT TCATGGTAAC GAAGGAAACC 3540 

40 

GCTAAAAAAT ATCATTTAGA GACAGTTTCA GATTTAGCAA AG CATAGTAA AGATTTACGT 3 600 

TTAGGTATGG ATAGTTCATG GATGAATCGT AAAGGCGATG GCTATGAAGG ATTTAAAAAA 3 66 0 

GAGTATGGTT TTGACTTTGG TACAGTGAGA CCAATGCAAA TAGGTCTAGT CTACGACGCA 3 72 0 

45 

TTAAACTCAG AGAAGTTAGA CGTTGCATTA GG TT ATT CT A CAGATGGTCG AATTGCGGCG 3 78 0 

TATGATTTGA AAGTACTTAA AGATGATAAA CAATTTTTCC CACCTTATGC TGCGAGTGCT 3 84 0 

GTTGCAACAA ATGAATTATT ACGGCAACAC CCAGAACTTA AAACGACGAT TAATAAGTTG 3 9 00 

50 

ACAGGAAAGA TTTCGACTTC AGAGATGCAA CG CTTGAATT ATGAAGCGGA TGGTAAAGGT 3 96 0 
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AAAGGTGGTC ATAAGTAATG GAAGGTAATT TATTACAGCA ATTATTCAAT TATTATGTTA 4 0 80 

CGAACTTTGG TTATCTATGG GATTTATTTT TCAAACACTT ATTAATGTCT GTCTATGGTG 414 0 

5 TGCTGTTTGC AgCTTTAATT GGTATTCCAT TGGGAATCTT GCTTGCaAGA TACACAAAAC 420 0 

TTTCTGGATT TGTAATTACA ATTGCAAATA TAATTCAAAC AGTTCCAGTC ATTGCAATGT 426 0 

TAGCTATTTT AATGTTAGTC ATGGGCTTAG GTTCAGAAAC AGTAGTTTTA ACAGTGTTTT 43 2 0 

10 

TATATGCGTT ACTTCCAATT ATAAAAAACA CTTATACTGG TATAGCTAGT GTTGATGCGA 4 3 80 

ATATTAAGGA TGCTGGCAAA GGTATGGGAA TGACACGCAA TCAAGTGCTA CGAATGATTG 44 4 0 

AATTACCGTT ATCTGTTTCG GTTATTATCG GTGGCATTCG TATTGCCTTG GTTGTTGCGA 4 50 0 

TAGGTGTTGT TGCCGTTGGA TCATTTATAG GAGCACCTAC GCTTGGTGAC ATTGTGATTC 4 56 0 

GTGGTACAAA TGCGACGGAT GGCACAACGT TTATTTTAGC AGGTGCGATT CCGATTGCTA 4 620 

2Q TCATTGCAAT CGTCATTGAT GTACTA1TAA GATTTTTAGA AAAACGATTA GACCCAACAA 4 6 80 

CACGACATCG TAAAAATCAA TCTAATCATC GGCCGCAAAG TATTAATATG TAATAGTAGA 474 0 

AGATGTTTAT AATTTAGCGA TTTCGTTTCA TGATTTATAA AAAATGAGGC TACTCAAGGA 4 800 

25 GCTCAAATAA TCTTTGAGTA GCCTTTTTAT AGGTTGTGTT TGTATGCGTT TACACTAAAA 4860 

TAGCAATTAT TATCATGAAA GTTTTTGGAT AAAAAGCGTT AATTATTGTA AAAATACTAA 4 920 

AAAATGAGAT GTTTTATTTA TAATTTTCTG CAAATTTATG ATATTGTTTC TTAATATAT C 4 9 80 

30 ATATTAAAAA TTTGTTTTTC TTAAACATAG GAGGCTTATC TAATTCATGG ACACATCAAA 5 04 0 

ACAATTTAGA GGTGACAACC GATTGCTTTT GGGTATCGTT TTAGGGGTTA TTACCTTTTG 5100 

GCTATTCGCG CAGTCACTTG TTAATCTTGT TGTCCCATTA CAATCAACAT ATAGTAGTGA 5160 

35 

CGTTGGAACG ATAAATATCG CTGTTAGCTT ATCTGCCTTA TTTGCTGGTT TGTTTATCGT 522 0 

AGGTGCTGGT GATGTTGCTG ATAAATTTGG TCGCGTCAAA ATTACTTATG TAGGATTGAT 528 0 

ATTAAATGTT GTAGGTTCAT TACTCATCAT CATTACACCT TTGCCAGCAT TTTTAATTAT 534 0 

40 

AGGTAGAATA ATTCAAGGTT TGTCTGCAGC ATGTATTATG CCATCAACAC TTGCTATTAT 54 00 

TAACGAATAT TATATTGGTA CAAGAAGACA ACGTGCCTTA AGCTATTGGT CTATTGGTTC 54 6 0 

TTGGGGTGGT AGTGGTATTT GTACGTTGTT TGGTGGCTTA ATGGCTACAT ATATAGGTTG 552 0 

45 

GCGTTCAATA TTTGTTGTTT CAATTCTATT AACATTATTA GCAATGTACT TAATCAAACA 5580 

TGCACCTGAG ACTAAAGCAG AACCAATCAA AGGTATGAAA GCAGAAGCTA AAAAGTTTGA 564 0 

50 CGTTATTGGT TTAGTCATTT TAGTAGTGAC GATGTTAAGT TTAAATGTAA TCATCACACA 570 0 

GACGTCTCAT TTTGGTTTAG TTTCACCGTT AATT CTAGGT TTAATTGTTG TGTTTATCTG 576 0 
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AATTTTTAAA AATAGAGGAT ACAGTGGTGC AACTATTTCA AACTTCTTAT TAAATGGTGT 58 80 

AGCAGGTGGT GCACTTATCG TTATTAACAC GTATTATCAA CAACAATTAG GATTTAATTC 594 0 

TTCGCAAACG GGTTATATTT CATTAACGTA TTTAATAACA GTGTTGTCAA TGATTCGTGT 6000 

AGGTGAAAAG ATTTTATCTC AACATGGTCC GAAGCGCCCA CTATTACTAG GAAGTGGCTT 606 0 

TACAGTGATT GGGTTAATCT TATTGT CGTT AACATTTTTA CCAGAAGTGT GGTATATCAT 612 0 

ATCTAGTATA GTTGGATATT TATTGTTTGG TACTGGTTTA GGATTATATG CTACACCATC 618 0 

AACTGATACA GCAGTTGCTA GTGCGCCAGA TGATAAGTCG GGTGTTGCTT CAGGTGTGTA 624 0 

TAAAATGGCG TCATCATTAG GAAATGCATT TGGAGTAGCA GTATCTGGTA CGGTTTATAC 63 00 

TGTGTTAGCA GCTAATTTAA ATTTGAACTT AGGTGGTTTC ACAGGTATGA TGTTTAATGC 63 60 

CTTGCTAGCA ATTGTTGCAT TTTTAGTCAT TTTACTATTA GTTCCTAAAA ATCAAACGAA 642 0 

TTTGTAAAAC TGAAATGAAA GCAAGTTATT ATGTAGGGAT TTTAAAGGAA ATTTTGTGAA 64 80 

AGTAAGTTTA TCATACACAC TTAATGTTGC GTATTGACGT TTAATGTTAG GTGTGTTCTT 6540

TTATAGACGA TAAAAGCTGT GTGCATATTA AGCGAATGAT TTTCAAATTG ACG CTAAT AT 660 0 

GCGAAAGTAG TATTTTTAAA ATGAACAACA ACGATGAAGA GGGGTTTATA GGATGAAAAT 6660 

TGCAATTGCT GGATCGGGTG CATTAGGTAG TGGCTTTGGT GCCAAACTAT TTCAAGCAGG 6 720 

ATATGATGTC ACACTTATTG ACGGATATAC ATCTCATGTT GAAGCGGTTA AG CAACATGG 6 780 

30 ATTAAATATA ACGATTAATG GAGAGGCATT CGAGTTAAAC ATTCCGATGT ATCATTTTAA 6 84 0 

TGATCAACCG GACGAAAGCA TTTACGATGT TGTCTTTCTA TTTCCAAAGT CTATGCAATT 6 900 

AAAAGAAGTG ATGGAAGATA TGAAGCCACA TATTGATAAT GAAACGATCG TCGTATGTAC 6 96 0 

35 GATGAATGGT CTGAAGCATG AAGAAGTCAT TGCGCAGTAT GTTGCTCAAT CACAAATTGT 7 020 

CAGAGGTGTT ACGACTTGGA CGGCAGGTCT TGAAAGCCCT GGACACAGTC ATTTACTTGG 7 080 

TAGTGGACCA GTTGAAATAG GTGAACTAGT GGATGAAGGT AAAGAAAATG TTATAAAAGT 714 0 

TGCTGATTTA CTTAACGAAG CGGAATTGAA TGGTGTCATT AGTAAAGATT TATACCAATC 72 0 0 

GATTTGGAAA AAGATTTGTG TTAATGGTAC GGCAAATGCA TTAAGCACAG TGTTGGAGTG 726 0 

TAATATGGCA TCGCTGAATG AAAGTAGTTA TGCGAAGTGT TTGATTTATA AATTAACGCA 732 0 

AGAAATAGTG CATGTAGCGA CGATTGATAA TGTTCATTTA AATGTTGATG AAGTATTTGA 73 8 0 

ATATTTAGTT GATTTAAATG AAaAAGTTGG TGCGCATTAT CCATCCATGT ATCAAGATTT 744 0 

AATTGTTAAT AATAGAAAAA CTGAAATTGA TTATATTAAT GGCGCAGTTG CAACATTAGG 7 500 

TAAACAACGT CaTATTGAAG CGCCAGTCAA TCGCTTTATT ACTGATTTAA TTCATACTAA 7 56 0 
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CAATCACGTG ATATTACGGT CATTATTAAG ATTGAAATGT AATAAATAAA GAACAGCAGT 76 8 0 

AAGGTACTTT CAAATTGAAA TGATCTTGGT GCTGTTTTTC TTGATTGATC TTCGTCATAA 774 0 

5 TTCAGATTTG TCATAGGcTA CGACATACTA TTAGTATTTA CTAGACAGTT TTTACGACGA 78 0 0 

CACTTTGAAA AATTTTGAGG CAAATCATTT GGAAGTCTCA CGTGAATTTT GTAAACTCAT 78 60 

CAAGCAAGTA ATTATATTAA AAAGACAAAT AGAGAAAAGG TGTTTATAAT GAGTAAAATT 7 92 0 

10 

TTTGTAACTG GTGCAACGGG CCTTATTGGC ATTAAATTAG TTCAAAGACT AAAAGAAGAG 7980 

GGGCATGAGG TTGCTGGTTT TACTACATCT GAGAATGGTC AACAAAAGCT AGCTGCTGTT 804 0 

AATGTAAAAG CATATATTGG TGATATATTA AAAGCTGATA CTATTGATCA AGCGTTAGCA 8100 

15 

GATTTTAAAC CAGAAATCAT TATCAATCAA ATTACGGATT TAAAAAATGT TGATATGGCA 8160 

GCAAATACGA AAGTACGTAT TGAAGGTTCT AAAAACCTAA TTGATGCGGC GAAAAAGCAT 8220 

20 GACGTTAAGA AAGTAATTGC CCAAAGTATT GCCTTTATGT ATGAACCTGG CGAAGGATTA 8280 

GCAAATGAGG AAACTTCACT TGATTTTAAC TCAACTGGCG ATAGAAAAGT AACGGTTGAT 834 0 

GGTGTGGTTG GTTTAGAAGA AGAAACGGCT CGTATGGATG AATACGTTGT TTTACGTTTT 84 00 

25 GGCTGGTTAT ATGGCCCAGG TACTTGGTAC GGAAAAGATG GCATGATTTA TAATCAATTT 84 60 

ATGGATGGTC AAGTGACACT TTCAGATGGC GTAACATCAT TTGTGCATCT TGATGATGCA 8 520 

GTTGAAACAT CTATTCAAGC TATTCATTTT GAAAATGGTA TCTATAATGT AGCAGATGAT 8580 

30 GCACCTGTTA AAGGTTCTGA ATTTGCAGAA TGGTATAAAG AACAACTTGG TGTTGAACCA 864 0 

AATATTGATA TTCAACCTGC GCAACCATTT GAACGTGGCG TAAGCAATGA GAAGTTTAAA 8 70 0 

GCGCAAGGTG GTACTCTGAT TTATCAAACT TGGAAAGATG GCATGAATCC AATTAAATAA 8760 

35 TAATTTATCC GTTTAATATA CAAAGAATAA AGACTTGGTC GAATCGTGGA TGATATATTA 8 82 0 

TCAAACGCAC GGCTCGAACA AGTCTTTTTT ATTATGTCTT CGTTATCTTT GTATGAAGGA 888 0 

ATAACAGAAT TACAATTAAT GTACTGAATA ATGCAATTAA TGTTGTGATT AGTGCTAATT 894 0 

40 

TAATTTCTAT TGGTAGCCAA GTCAGTACAA AAGACCAATT ATTGCTACCG AGAATGAGAT 900 0 

ATGGTAATGC ATATAATATG AGCGCTAAAG CGATACATAT ACATAATGAT AACCAACTCA 906 0 

ATACAGCAAT CC 9072 

45 

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16826 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

GTGGAACAGC TGTAACTATA TCATTTCTTT CAACATTTAT TGGGAAAATG TTAGCTACAT 60

TTCTATATCC GATTAATAAT GTAGTACTTT CATATATnTC TGTAAATGAA AGTGACAATA 120 

TAAAGAAGCA ATATTTGaAA ACTAATCTAA TTGCTATAGC TGCCCTATGT TTAGTCATGA 180 

TTATATGTTA TCCAATTACA ATAATTATTG TCTCTTTACT GTATAACATT GATTCAAGTT 24 0 

TATATTCGAA GTTTATTATT TTAGGTAATA TAGGTGTTTT ATTCAATGCA GTGAGTATTA 300 

TGATCCAAAC TTTAAATACA AAACACGCAT CAATAACATT ACAAGCGAAT TATATGACGC 36 0 

TTCACACGAT TACATTTATA TTCATAACTA TTTTAATGAC AATTGCGTTT GGTCTAAATG 420 

GATTCTTTTG GACAACGCTG TTCAGCAACA TTATTAAGTA TGTGATTTTA AATATTATAG 480

GTTTAAAGTC TAAATTCATT AATAAAAAGG ACGTCGATTA GATGAGTGAA AAAAAGATTT 54 0 

TGATTTTATG TCAGTATTTT TATCCGGAAT ATGTATCTTC TGCGACGTTA CCAACTCAAT 600 

TGGCGGAAGA TTTAATTGCG AATCACATTA ATGTCGATGT CATGTGTGGA TGGCCATATG 660

AATATAGTAA TCATAAACAG GTTTCTAAAA CCGAGATGCA TCGTGGTATT CGCATTCGAC 720 

GTCTCAAGTA TTCGAGGTTT AATAACAAAA GTAAGGTTGG AAGGATCATC AATTTCTTTA 780 

GTTTATTTTC AAAATTCGTG ATTAATATAC CTAAAATGTT GAAATATGAT CAGATTCTTG 840 

TTTACTCTAA TCCACCAATC TTGCCATTAA TACCAGACGT TTTACACAGA CTGCTTAAGA 900 

30 AAAAATATTC TTTTGTGGTG TATGATATAG CACCTGATAA TGCGATTAAG ACAGGTGCAA 96 0 

CTCGTCCAGG TAG CATG ATT GATAAGCTGA TGCGTTACAT TAATAGACAT GTCTACAAGA 1020 

ATGCTGAAAA TGTCATTGTC CTTGGTACGG AAATGAAAAA CTACTTACTA AATCATCAAA 1080 

TTTCTAAAAA TGCTGACAAT ATCCATGTGA TTCCTAACTG GTATGACATG CGTCAATTAC 114 0 

AAG^CAATCG TATCTATAAT GACACATTTA AAGCTTACCG TGAGCAATAC GACAAAATTT 12 00 

TATTGTATAG CGGTAATATG GGGCAGTTAC AGGATATGGA GACACTTATC TCATTTTTAA 126 0 

AATTAAATAA GGATCAGTCT CAAACGTTAA CAATACTTTG TGGTCATGGT AAGAAATTTG 132 0 

CAGATGTCAA AACGGCAATA GaAGACCATC GTATTGAAAA TGTTAAAATG TTTGAGTTTT 13 8 0 

TAACAGGTAC AGACTATGCT GACGTATTAA AAATTGCGGA TGTATGTATT GCATCGCTGA 144 0 

TTAAAGAAGG CGTCGGTTTA GGCGTGCCGA GCAAGAATTA TGGCTATCTT GCAGCTAAGA 1500 

AAGCGTTGGT ACTCATCATG GATAAGCAAT CTGATATCGT TCAACATGTT GAACAATATG 1560 

ATGCGGGTAT CCAAATTGAT AATGGCGATG CACATGCCAT TTATAACTTC ATCAACACTC 162 0 

ACTCGAGTAA GGAATTGCAC GAGATGGGTG AGCGCGCACA TCAACTGTTT AAAGATAAAT 16 8 0 
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AAGCGATTAT 


TCGATGTAGT 


GAGTTCAATA 


TATGGTTTAG 


TAGTTTTAAG 


TCCGATTCTG 


1800 




TTAATTACAG 


CATTACTAAT 


TAAAATGGAa 


TCACCTGGAC 


CAGCCATTTT 


CAAACAAAAA 


1860 


5 


AGACCGACGA 


TTAATAATGA 


ATTGTTTAAT 


ATTTATAAGT 


TTAGATCAAT 


GAAAATAGAC 


1920 




ACACCTAATG 


TTGCAACTGA 


TTTAATGGAT 


TCAACATCGT 


ATATAACAAA 


GACAGGGAAG 


1980 


10 


GTCATTCGTA 


AGACCTCTAT 


TGATGAATTG 


CCACAATTAT 


TGAATGTTTT 


AAAAGGAGAA 


2040 


ATGTCAATTG 


TAGGTCCTAG 


ACCAGCGCTT 


TATAATCAAT 


ACGAATTAAT 


CGAAAAACGT 


2100 




ACAAAAGCGA 


ACGTGCATAC 


GATTAGACCA 


GGTGTGACAG 


GACTAGCTCA 


AGTGATGGGG 


2160 


15 


AGAGATGATA 


TCACTGATGA 


TCAAAAAGTA 


GCGTATGATC 


ATTATTACTT 


AACACATCAA 


2220 


TCTATGATGC 


TTGATATGTA 


TATCATATAT 


AAAACAATTA 


AAAATATCGT 


TACTTCAGAA 


2280 




GGTGTGCATC 


ACTAATGAGA 


AAAAATATTT 


TAATTACAGG 


CGTACATGGA 


TATATCGGTA 


2340 


20 


ATGCTTTAAA 


AGATAAGCTT 


ATTGAACAAG 


GACAT CAAGT 


AGATCAAATT 


AATGTTAGGA 


2400 




ATCAATTATG 


GAAGTCGACC 


TCGTTCAAAG 


ATTATGATGT 


TTTAATTCAT 


ACAGCAGCTT 


2460 




TGGTTCACAA 


CAATTCACCT 


CAAGCAAGGC 


TATCTGATTA 


TATGCAAGTG 


AATATGTTGC 


2520 


25 


TGACGAAACA 


ATTGGCACAA 


AAGGCTAAAG 


CTGAAGACGT 


TAAACAATTT 


ATTTTTATGA 


2580 




GTACTATGGC 


AGTTTATGGA 


AAAGAAGGTC 


ATGTTGGTAA 


ATCAGATCAA 


GTTGATACAC 


2640 




AAACACCAAT 


GAACCCTACG 


ACCAACTATG 


GTATTTCCAA 


AAAGTTCGCT 


GAACAAGCAT 


2700 


30 


TACAAGAATT 


GATTAGTGAT 


TCGTTTAAAG 


TAGCAATTGT 


GAGACCACCA 


ATGATTTATG 


2760 




GTGCACATTG 


CCCAGGAAAT 


TTCCAACGGT 


TAATGCAATT 


GTCAAAGCGA 


TTGCCAATCA 


2820 




TTCCCAATAT 


TAACAATCAG 


CGCAGTGCAT 


TATATATTAA 


ACATCTGACA 


GCATTTATTG 


2880 


35 


ATCAATTAAT 


ATCATTAGAA 


GTGACAGGTG 


TGTACCATCC 


TCAAGATAGT 


TTTTACTTTG 


2940 




ATACATCGTC 


AGTAATGTAT 


GAAATACGTC 


GCCAATCACA 


TCGTAAAACG 


GTATTGATCA 


3000 


40 


ACATGCCTTC 


AATGCTAAAT 


AAGTATTTTA ATAAGTTGTC 


GGTCTTTAGA 


AAATTATTCG 


3060 


GCAATTTAAT 


ATACAGCAAT 


ACGTTATATG 


AAAATAATAA 


TGCACTTGAA 


ATTATTCCTG 


3120 




GAAAAATGTC 


ACTTGTTATT 


GCGGACATCA 


TGGATGAAAC 


GACAACCAAA 


GATAAGGCAT 


3180 


45 


AAGTCATCTA 


TTAAATAAAA 


TCAACATACA 


AATCGTTTTA 


TTTGGAGGTT 


ATAGTATGAA 


3240 




GTTAACAGTA 


GTTGGCTTAG 


GTTATATTGG 


TTTACCAACA 


TCAATTATGT 


TTGCAAAACA 


3300 




TGGcGTCGAT 


GTGCTTGGTG 


TTGATATTAA 


TCAGCAAACG 


ATTGATAAGT 


TACAAAGTGG 


3360 


50 


TCAAATTAGT 


ATTGAAGAAC 


CTGGATTACA 


AGAGGTTTAT 


GAAGAGGTAC 


TGTCATCGGG 


3420 




AAAATTGAAG 


GTATCTACAA 


CGCCAGATGC 


ATCTGATGTT 


TTTATCATTG 


CCGTTCCGAC 


3480 
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TAGTATTTTA TCATTTTTAG AAAAAGGAAA TACCATTATT GTAGAGTCGA CAATTGCGCC 3600 

TAAAACGATG GATGATTTTG TAAAACCAGT CATTGAAAAT TTAGGGTTTA CAATAGGTGA 3 660 

AGATATTTAT TTAGTGCATT GTCCAGAACG TGTACTGCCA GGAAAAATTT TAGAAGAATT 372 0 

AGTTCATAAC AATCGTATCA TTGGCGGTGT GACTGAAGCT TGTATTGAAG CGGGTAAACG 378 0 

TGTCTATCGC ACATTCGTTC AGGGAGAAAT GATTGAAACA GATGCACGTA CTGCTGAAAT 384 0 

GAGTAAGCTA ATGGAAAACA CATATAGAGA CGTGAACATT GCTTTAGCTA ATGAATTAAC 3900 

AAAAATTTGC AATAACTTAA ATATTAATGT ATTAGATGTG ATTGAAATGG CAAACAAACA 3960 

TCCGCGTGTT AA CATC CATC AGCCTGGTCC AGGTGTAGGC GGTCATTGTT TAGCTGTTGA 402 0 

TCCGTACTTT ATTATTGCTA AAGACCCTGA AAATGCAAAG TTAATTCAAA CTGGACGTGA 408 0 

AATTAATAAT TCAATGCCGG CCTATGTTGT TGATACAACG AAGCAAATCA TCAAAGTGTT 414 0 

GAGCGGGAAT AAAGTCACAG TATTTGGTTT AACTTATAAA GGTGATGTTG ATGATATAAG 4200 

AGAATCACCA GCATTTGATA TTTATGAGCT ATTAAATCAA GAACCAGACA TAGAAGTATG 426 0 

TGCTTATGAT CCACATGTTG AATTAGATTT TGTGGAACAT GATATGTCAC ATGCTGTCAA 432 0 

AGACGCATCG CTAGTATTGA TTTTAAGTGA CCACTCAGAA TTTAAAAATT TATCGGACAG 43 8 0 

TCATTTTGAT AAAATGAAGC ATAAAGTGAT TTTTGATACA AAAAATGTTG TGAAATCATC 444 0 

ATTTGAAGAT GTATCGTATT ATAATTATGG CAATATATTT AATTTTATCG ACAAATAAAA 4 50 0 

30 TGTGTCAAAC TAGGGCATAC ATGATTAAGG AAAGATAAGC TGTCATGTGT TTGAACTTCA 4 560 

GAGAGGATAA TGTTATGAAA AAAATTATGG TTATTTTCGG TACGAGACCC GAAGCAATAA 4620 

AAATGGCACC ATTAGTAAAA GAAATTGATC ATAATGGGAA CTTTGAAGCG AACATTGTGA 46 8 0 

TTACAGCACA ACATAGAGAT ATGTTAGATA GTGTGTTAAG TATATTTGAT ATTCAAGCTG 474 0 

ATCATGATTT AAATATTATG CAAGATCAAC AAACATTAGC AGGCCTTACG GCGAATGCAC 4 80 0 

TTGCTAAACT TG AT AG CATC ATTAATGAGG AACAACCGGA TATGATTTTA GTACATGGTG 4 86 0 

ATACTACAAC GACTTTTGTA GGAAGTTTGG CAGCATTTTA TCATCAAATT CCGGTCGGAC 4 92 0 

ATGTAGAAGC TGGACTTCGA ACACATCAGA AATACTCACC ATTTCCTGAA GAGTTAAATC 4 98 0 

GAGTCATGGT AAGTAATATT GCTGAATTGA ATTTTGCGCC AACAGTAATT GCAGCTAAAA 504 0 

ATTTACTTTT TGAAAACAAA GACAAAGAGC GTATCTTTAT TACTGGAAAT ACAGTTATTG 510 0 

ACGCATTGTC AACAACAGTT CAAAATGATT TTGTTTCAAC GATTATTAAT AAACATAAAG 516 0 

GCAAGAAAGT TGTTTTACTA ACAGCGCATC GTCGTGAAAA TATTGGGGAA CCGATGCATC 522 0 

AGATTTTTAA AGCAGTAAGA GATTTGGCAG ATGAATATAA AGATGTTGTC TTCATTTATC 52 8 0 
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GGATTGAATT AATTGAGCCA TTAGATGCGA TTGAGTTCCA TAATTTTACA AATCAATCGT 54 0 0 

ACCTCGTGCT GACAGATTCT GGTGGTATTC AAGAGGAGGC TCCTACATTT GGAAAACCTG 54 60 

TGTTGGTATT AAGGAATCAT ACAGAGCGTC CCGAAGGCGT TGAGGCGGGA ACATCGAGAG 5520 

TAATTGGCAC AGATTATGAC AATATTGTTC GAAATGTGAA ACAATTGATT GAGGATGATG 558 0 

AAGCGTATCA ACGTATGAGT CAAGCGAATA ATCCATATGG TGATGGACAA GCATCACGAC 564 0 

GTATTTGTGA AGCAATAGAA TATTATTTTG GATTGCGCAC AGACAAGCCG GATGAATTCG 5700 

TACCTTTACG TCACAAATAA TAAAAAACCC CTAATCATGA AGTTGGTTTA GACAACCAGC 576 0 

GGTGACTAGG GGTTTTTAAT ATATTTATTT TTGATAGTGG TAGCCAATAT CATATTTGAA 5820 

TACTTTATTT GATAATATTG GACTTTGCTG TCCATCGTCA TCACTTTTTA AACGTACATT 58 80 

TTTATGAGCT TCTTTAAATA CATCGGAATT CAACCAATTA TTAAAGCTAT CTTCAGATTC 5 94 0 

CCAAATAGTT AAGATTTTAA CTTCGTCTGT ATCCTCGGTA TTTAATGTTT TAGTGACAAA 6 000 

CATTTGTTGG AAGCCTTCAA TAGTTTCAAT ACCTTGTCTA TTGTAAAAAC GTTCAATCGT 6060 

TTCTTCCGCA CTGCCTTTTT GTAATTGTAA TCTATTTTCT GCCATAAACA TGGGCAATCA 6120 

25 CTCCTCTATT TTATGATTTG ATTTGGGTAA TGTTTTTACA AATGTAAAGA GTACAGCGGT 618 0 

TTGTATGATA ACCATTATGA TTAATCCTAC ACGGACTGCA AGAACATCCA CCATATAAAT 624 0 

TGAAAAACCT ATTACAATGT ATAAGCTAAT TAAAATTTTA ATTTTCTGTT GTAGCGTGTA 6 3 00 

GCCTCGATGT AAATAAAAGT TTTCTACATA TTCTTTATAA ATTTTTTGAT TAATAAGCCA 63 6 0 

ATTGTAAAAG CGATCTGAAC TTCGAGCAAA GCAAAAAACT GCTACGAGTA AAAAAGGGGT 64 2 0 

CGTTGGCAGT AAAGGTAATA CGGCACCTGC AATACCAAGC GCTGTAAATA TTAAGCCAAT 64 8 0 

GACGATTAAA ATAAGTCGCA TTGAAAAAAC TCCATTCTAG TACTAATGCG CATGTAATAT 654 0 

TGTTTTAGTA ATATAACTCA TGCTAAATAT AATGTGTATG ATAAGTGCAA TGACTCAGTA 6600 

AAATGAAACG ATGTTGAATT ATCCTTGTCA CATTAACGCA TTTTAAGCGC GACTTTCATA 66 6 0 

ACAACCAAAC TATTTAATGA GAATTATTCT CAAGTATTAT AGTTATATTA TGTGTTTTAT 672 0 

TTTTGAAAAG TGCAATATGT TTTCGAAAAT AAGATTATTT TTATGTGCAA AAACGACGCA 678 0 

AAAGTTTTAA AAATGAGACT TCTGTGAGCT GATTATTTTA TAAAATGTAA ACGCTTACTA 6 84 0 

TATAATGTGA ATCATATCGT TTAAAAGCAT TATTAAATAT GATGCTAAGA GATTTATATT 6 90 0 

ATAGC CAATA AACAAAGGAG AGATAATATG GCAGTAAACG TTCGAGATTA TATTGCAGAG 696 0 

AATTATGGTT TATTTATCAA TGGGGAATTT GTTAAAGGTA GCAGTGACGA AACAATCGAA 70 2 0 

GTGACTAATC CAGCAACTGG AGAAACACTA TCACATATTA CAAGAGCAAA AGATAAAGAT 70 8 0 
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TCAGAACGTG CACAAATGTT GCGTGATATT GGTGATAAAT TAATGGCACA AAAAGATAAA 72 00 

ATTGCAATGA TTGAAACATT AAATAATGGT AAACCGATTC GTGAGACAAC AGCAATTGAT 72 6 0 

ATTCCATTTG CTGCAAGACA TTTCCATTAT TTCGCAAGTG TTATTGAAAC AGAAGAAGGT 73 2 0 

ACAGTGAATG ATATCGATAA AGACACAATG AGTATCGTAC GACATGAGCC GATTGGCGTC 7 3 80 

GTAGGTGCTG TTGTTGCTTG GAACTTCCCA ATGCTATTAG CTGCATGGAA GATTGCGCCA 744 0 

gCCATTGCTG CAGGTAATAC AATTGTGATT CAACCTTCGT CTTCAACACC ATTAAGTTTA 7 500 

TTGGAAGTTG CTAAAATTTT CCAAGAGGTA TTACCTAAAG GTGTTGTCAA TATACTAACG 7560 

GGTAAAGGTT CAGAATCAGG TAATGCAATT TTCAATCATG ATGGTGTAGA TAAATTATCA 7620 

15 

TTTACGGGCT CAACTGATGT AGGTTATCAA GTTGCCGAAG CTGCAGCAAA ACATCTAGTA 76 8 0 

CCCGCTACAT TAGAGCTTGG TGGTAAAAGC GCCAATATCA TATTAGATGA TGCTAATTTA 774 0 

20 GACCTTGCAG TTGAAGGTAT TCAGTTAGGT ATTTTATTCA ACCAAGGTGA AGTATGTAGT 7800 

GCAGGTTCTC G ATT ATT AG T TCATGAAAAA ATTTATGATC AATTGGTGCC ACGTTTACAA 7 860 

GAGGCATTTT CAAATATTAA AGTTGGAAAT CCACAAGATG AAGCTACACA AATGGGTAGT 7 920 

25 CAAACTGGTA AGGATCAATT AGATAAAATT CAATCATATA TTGATGCAGC AAAAGAATCA 7 980 

GATGCACAAA TTTTAGCAGG CGGTCATCGC TTAACTGAAA ATGGATTAGA TAAAGGGTTC 804 0 

TTCTTTGAGC CGACATTAAT TGctGTGCCA GACAATCATC ACAAATTAGC ACAAGAAGAA 8100 

30 ATATTTGGAC CAGTGTTAAC AGTGATTAAA GTGAAGGACG ATCAAGAAGC AATTGATATA 816 0 

GCTAATGATT CTGAGTATGG TTTAGCAGGC GGTGTATTTT CTCAAAATAT CACACGTGCA 8220

TTAAATATTG CTAAAGCTGT ACGTACAGGA CGTATTTGGA TTAACACTTA CAACCAAGTA 82 80 

CCAGAAGGCG CACCATTTGG TGGTTATAAA AAATCAGGTA TCGGTCGAGA AACTTATAAA 834 0 

GGTGCGTTAA GTAACTATCA ACAAGTTAAA AATATTTATA TTGATACAAG CAATGCTTTA 84 0 0 

AAAGGTTTGT ACTAGAATAA ATATCGTTTC TGAAGCGTGT TTGTAGGTCA GTCTAGCGGT 84 6 0 

AAGTCTTAAC ATTTAACGGC GTTGTTTAGA TTTTAAGCAA AACAAAATAT ATAGGAACAC 852 0 

GTATCATGAT ATTAGGATAT AATGACTAAA ATAATAGCAG TAGGATGGTT TTTAATTGCA 8 58 0 

AATCATCTTA CTGCTGTTTT TAATTATGCT AATTTGCGAT GCGGCTATTA TAAGGACAGA 864 0 

GTTGTTTATT AATTATGGTG ATTTAGAAAT ATGAAGTTCA ATATGCAAAG TCATCGTTTG 87 0 0 

TTTTAATATG CGGAACAATC ATTAAAGTTA TTGCGATTTT TTGAACTTAA TGAAACTAAA 87 6 0 

CAATAAATTT GAGATACTTT TTTGTCATTT TTATGTAACT AACACAATAA TCTCGTACAT 8 82 0 

TATTAAAATT TTCTATATGA TAGGAATAAA GCAAAGCGCG AGTGTGCTGT AAAAGTTTTC 8880

GATGATGTAT AAATCATGGT TAATTACGGA AGCATTAATA TTAACCTGAG AAGCTATAAA 9000

GAATTATTTT TAAAAGCGAC AATATTAAAT ACGACGCATT TATTTAGGAG TGGCAAACGT 9060

ATGAATGGGA AAAAGGCGAA TACGATAAAC AGATACAAAT ATTTTCATCA TGTCAATCAT 9120

CAAAAAATTC AACAAAGTTC TAAAAAGACG CTGTGGGCAT CACTAATCAT CACATTGTTA 9180

TTTACAGTGA TTGAATTTGT CGGAGGTTTA GTATCTAATt CATTGGCATT ACTGTCAGAT 9240

TCATTTCATA TGCTTAGTGA TGTATTAGCA CTTGGTTTAT CTATGTTGGC CATTTATTTT 9300

GCAAGTAAAA AGCCGACTGC ACGATACACA TTTGGATATT TAAGATTTGA GATATTAGCT 9360

GCATTTTTAA ATGGTTTAGC ATTAATTGTA ATTTCAATCT GGATTTTATA TGAAGCTATT 9420

GTACGTATTA TTTATCCGCA ACCAATTGAA AGTGGCATTA TGTTTATGAT TGCTAGTATT 9480

GGTTTACTCG TCAATATTAT TTTGACTGTT ATCCTTGTAA GGTCTTTAAA ACAAGAAGAC 9540

AATATCAATA TTCAAAGTGC ATTATGGCAT TTCATGGGAG ACTTATTGAA CTCTATTGGT 9600

GTCATCGTTG CAGTTGTATT GATTTACTTT ACAGGATGGC GCATCATCGA CCCAATCATT 9660

AGTATTGTAA TTTCACTCAT CATTTTACGT GGTGGTTATA AAATTACGCG TAATGCgTGG 9720

tTAATTTTAA TGGAAAGTGT GCCTCAACAT TTGGATACTG ATCAAATTAT GGCAGATATT 9780

AAAAACATAG ATGGCATATT AGATGTACAT GAATTTCATT TGTGGAGTAT TACAACAGAG 9840

CATTATTCA7 TAAGTGCCCA TGTTGTGTTA GATAAAAAAT ATGAGGGTGA TGATTATCAA 9900

GCGATTGATC AAGTATCATC ATTGTTGAAA GAAAAATATG GCATTGCACA TTCAACGTTG 9960

CAAATTGAAA ACTTGCAATT GAATCCATTA GATGAGCCAT ACTTCGACAA ATTAACATAA 10020

ATAAAACATT GTAGCGCCTA AAACATTAAT CTATGTCATA GGCGCACGTT TCGTTTTATA 10080

CTTATGTTGC ATCATTTAAA TGATTTTCGT CAATTTCTTT GATGCTATCT ACATCTAACA 10140

CGACATCTTT AGGTTTCAAA ATATGAATAT GTTTTTCATC ATTTGTATGT AAAATGCGTT 10200

CTATGATGTA CCTTTGACCG GCCATTGTTT CTACAGCAAT CTTTTTGTTT CTAGCTAAAC 10260

TTGCTACGAC AGATTCTTTA TCCATAATGA TAGCCCCCTA TATATATGTT TATTTACTTA 10320

TACCCTAACA TGATTTTTAT ACTCTTTGAA AATATATTTT ACAGAATTTT ATCTAAATAT 10380

TTAAAAAAAT ATCTTAATAT CCTTGTAATC CGATAAGAAT TATAGTAATA TTTTTTCAAC 10440

CATtGTTATA GGAGGTCTTA TTAATGAPAT T UTTTTTi TT inl ill inl 1 a n TrTTP & TT MA lUil 1 1 10500

TTGCATCAAC GAAAGAAGAA CTAGAAGCAA AGGCAGCATC ACTATCTACG AAGACAATTC 10560

CAACATTAAT TGAAGTACAA GCTACTGAAA ATTTAACTCA TGGTTATTTT ATTGTGGAAG 10620

CAAATGACGA aGCAGAAGCT AAACAATTTT TAACAGAAGC AGATATTAGT ATTCAATTAG 10680

TTGATTACCT TGTAACTTGG AACATTCCGG AAGGCATTAC GATGGATCAA TATTTAGCAC 10800

GTAAAAAGAA AAATTCTGTT CATTATGAAG AAGTGCCAGA AGTTGAATTT AAACGCACAT 10860

ATGTATGTGA AGATATGTCT AAATGTATTT GTTTATACAA CGCACCTGAT GAAGAAGCGG 10920

TACGTCGCGC GCGCAAAGCA GTTGATACAC CGATTGATGG CATCGAAAAA CTTTAATAAG 10980

ACAACAAGTT GATGAGATAT ATGTATATAG GTTTGGCATG GATTTCGATT GCAGTTAATT 11040

AGAATAGCTC AATGCTATAA ATGTAAGTAG TTGATATGAA GAAACTAATG AACTAAATGC 11100

AAGTATTGTC TAAAACAATC ATTTTATTGA AATTTAGTAG AGCTGAAATT AATATAACGT 11160

CGTTAATTGA ATAACGCTTA TGTTATAAGA GCACTCATAC CAAACCATAA TCATCTATAG 11220

ATATAACAAT TCACGATATA AGGGCTGTGT TTGGCATAGC CCTTTAGATA TACACTTAAT 11280

TCCTATTAAA ATAGTAGGGA TTAAAAGGGG GCTTGTCATG ATTAAAATTC AACAATTACA 11340

ACATCACTTT GGATCACATA AAGTAATTCA TAACTTTAAT TTGGACATTA GCAAGGGAGA 11400

AATAGTCACT TTCATAGGGA AAAGTGGTTG CGGAAAGTCT ACTTTACTCA ATATTATCGG 11460

TGGATTTATT CATCCATCGT CTGGTCGTGT CATTATTGAT AACGAAATTA AACAACAGCC 11520

ATCTCCAGAT TGTTTAATGC TATTTCAACA TCATAATTTG CTGCCATGGA AAACGATTAA 11580

TGACAACATT AGGATTGGAT TACAACAGAA AATTAGTGAT GAAGAGATTA ACGCACAGCT 11640

TAAATTAGTT GATTTAGAAG ACAGGGGAAA GCATTTTCCC GAGCAACTGT CCGGGGGTAT 11700

GAAACAACGT GTGGCACTAT GTCGAGCGCA TGTGCATAAG CCTAACGTTA TATTGATGGA 11760

TGAGCCATTA GGTGCATTAG ATGCATTTAC ACGTTATAAA CTTCAGGATC AACTAGTGCA 11820

aCTAAAACAT AAAACGCAAT CAACTATTAT TTTAGTGACG CATGACATTG ATGAAGCTAT 11880

TTATCTTTCC GACCGCATTG TTCTGTTAGG TGAAGGGTGC AATATTATTT CTCAATATGA 11940

AATTACAGCA TCACATCCAC GCAGTCGTAA TGATAGCCAC CTACTTAAGA TTCGTAATGA 12000

AATTATGGAA ACATTTGCAT TGAATCATCA TCAAGTTGAA CCTGAATATT ATTTATAAGG 12060

AGTGAGTGAC GATGAAAAGG TTAAGCATAA TCGTCATCAT TGGAATCTTT ATAATTACAG 12120

GATGTGATTG GCAAAGGACG TCTAAAGAAC GGTCTAAAAA TGCCCAAAAT CAGCAAGTGA 12180

TTAAAATTGG ATATTTGCCG ATTACACATT CAGCTAATTT GATGATGACT AAAAAATTAT 12240

TATCACAATA CAATCATCCG AAATATAAAC TAGAATTAGT TAAATTCAAT AATTGGCCAG 12300

ATTTAATGGA CGCATTAAAC AGTGGTCGTA TTGATGGTGC ATCAACTTTA ATAGAGCTAG 12360

CGATGAAATC AAAACAGAAG GGCTCAAATA TAAAGGCTGT GGCATTGGGC CATCATGAAG 12420

GCAATGTCAT TATGGGACAA AAAGGTATGC ACTTAAATGA ATTTAATAAT AATGGCGATG 12480

GTAAACAATT AAAGATTAAA CCGGGGCATT TTAGCTATCA TGAAATGTCG CCAGCAGAAA 12600

TGCCAGCCGC ATTGAGTGAA CACAGAATTA CAGGGTATTC TGTAGCCGAA CCATTCGGTG 12660

CACTGGGTGA AAAGTTAGGC AAAGGTAAGA CTTTGAAACA TGGTGATGAC GTTATACCTG 12720

ATGCGTATTG CTGTGTGCTA GTACTGAGAG GGGAATTGCT TGATCAACAC AAGGATGTAG 12780

CGCAAgCATT TGTACAAGAT TATAAAAAGT CTGGCTTTAA AATGAATGAT CGCAAGCAAA 12840

GTGTAGACAT TATGACGCAT CATTTTAAAC AAAGTCGTGA CGTTTTAACA CAGTCAGCGG 12900

CATGGACATC CTATGGTGAT TTAACAATTA AGCCATCCGG CTATCAAGAA ATTACGACAT 12960

TGGTAAAACA ACATCATTTG TTTAATCCAC CTGCATATGA TGACTTTGTT GAACCGTCAT 13020

TGTATAAGGA GGCATCGCGT TCATGACACG TCCCACAAAT AACAAATTTA TATTACCTAT 13080

TATCACATTT ATTATTTTCT TAGGCATTTG GGAAATGGTC ATTATTATTG GGCATTACCA 13140

ACCTGTATTG TTACCGGGTC CTGCTCTTGT AGGAAAAAGT ATATGGTCTT TCATTGTTAC 13200

TGGAGAAATT TTCCAACATT TAGCAATTAG TTTATGGAGA TTTGTAGCGG GCTTTGTTGT 13260

CGCATTGTTG GTTGCTATTC CATTGGGCTT CTTGCTTGGA AGGAATCGTT GGCTATACAA 13320

CGCTATCGAA CCGCTATTTC AATTGATTAG GCCGATATCT CCGATAGCAT GGGCACCATT 13380

TGTTGTTCTA TGGTTTGGTA TTGGTAGTTT GCCAGCGATT GCGATTATTT TTATCGCTGC 13440

TTTTTTCCCA ATTGTGTTCA ATACTATTAA AGGCGTTAGA GACATTGAAC CTCAATATTT 13500

AAAAATAGCA GCAAATTTAA ATTTAACTGG GTGGTCATTG TATCGCAATA TATTATTTCC 13560

CGGGGCATTT AAACAAATCA TGGCTGGGAT ACATATGGCG GTAGGAACAA GTTGGATATT 13620

TTTAGTTTCT GGTGAAATGA TTGGTGCACA ATCGGGATTA GGTTTTTTAA TCGTTGATGC 13680

ACGAAATATG TTGAACTTAG AAGATGTTTT AGCAGCAATA TTCTTTATCG GATTATTTGG 13740

TTTTATTATT GATCGATTCA TTAGTTATAT TGAGCAGTTT ATACTTAGAA GATTTGGTGA 13800

ATAAGGAGAG ATGATGATGA CTTTAGAAAC GCTTATCAAA GAACAATTAG ATCCTCATTT 13860

AGTAGAAGTT GATGAAGGGA CGTATTATCC GAGAACATTT ATTCAGCAAT TATTTGTAGA 13920

TGGTTATTTC GGTGAGGCGG CATTGAGAAA AAATGCTGAA GTAATCGAAG CTGTATCGCA 13980

GTCTTGTTTG ACAACAGGAT TTTGTTTATG GTGCCAATTA GCTTTTTCAA CGTATTTAGA 14040

AAATGCCACG CAGCCACATT TAAATAATGA CTTACAACAG CAATTGTTAT CTGGAGAAAT 14100

ATTAGGTGCT ACCGGATTGT CTAATCCGAT GAAGTCATTT AATGATTTAG AAAAGTTGAA 14160

CCTTGAACAC ACTTATGTTG ATGGACAATT GGTTGTCAGT GGACGTATGC CAGCTGTAAG 14220

TAATATTCAA GAAGACCATT ATTTTGGTGC GATTTCGAAA CATGAATCAT CAGATGAATT 14280

TTTAGGAGTC AACGGGTCAG CAACGTATCA AATCACATTG AATCAAGTCG TAGTGCCACA 14400

ATCACAAATT ATCACGCATG ATGCGAAGCA GTTTGCGGCA ACTATTCGCC CGCAATTTAT 14460

TGCTTACCAA ATTCCAATAG GATTAGGCTC AATTAAAAGT TCTTTAGAGT TAATTGATGC 14520

AT7TTCAAAT GTGCAAAACG GAATAAATCA ATATTTAGAG TATGATGTTG AAGCTTTTAA 14580

AAAACGTTAT CGTCAACTTA GAGAGGAATA TTATGCAATA TTAGATGACG GTAACITAAC 14640

TTCACATTTA AATGAATTAA TATCATTGAA GAAGGACATC GGCTATTTAT TGTTAGATGT 14700

AAATCAAGCT TCTGTTGTCA ATGGTGGTTC TAGAGCGTAC ACACCATATT CGCCACAAGT 14760

TCGCAAGTTA AAAGAAGGAT TCTTCTTCGC AGCATTGACA CCGACATTAA GACATTTAGG 14820

TAAACTTGAA GCAGAGTTGA AGGGGTAAGT GTGATAAGCT GATTTTTTGT TTAGATGCGT 14880

TTGTTGAAAC ATTTTTTAAA ATAATATAAA TCTTAGTTTA TAAACATTTT CTGTTAATTT 14940

GTTATATCCT TTTAACTAGG AAAATATACA TTTCGTAATA ATAATAATCG TTATCATTGA 15000

AAAAGTGTTA ATAAGGTGTA TAATGAAAAT GTGAACAATT AATGAACTTC TTATTTTAAA 15060

GAAGGTGAAT ACTATAGATA CGCATACTAA AGAACAACAA TTCTCGAATC TAGTAAGATC 15120

TTATCGTAAA GAATACGTGG GTAAAGGACC CAATAGTATT CGAGTGTCGT TTAAAGATAA 15180

TTGGGCGATT GCACATATGA CAGGTGTTTT GAGTAAAGTT GAGAGTTTTT ACCTAAACGA 15240

CAAACGCAAT GAATCGATGC TCCATTATAC ACGCACAGAG AAGATTAAAC AGATGTATAA 15300

AGAAATAGAT GTAAATGAGA TGGAAAGTCT TGTAGGCGCT AAGTTTGTAA AATTATTTAC 15360

AGATATTGAT TTGAATGATG ATGAAGTCAT TTCAATATTT G7TTTCGATA AGTCAATAGA 15420

ATAAGTGTTG CTGGTGTAAG GTACACGGTG CTGTTTGCTA ACTTCGCTTT GAATTTAACA 15480

ATAATTCAAG GGGGTGGTAT GTCAAACGGT GCCGTTTTTT TGTCATATTT TTAAAACAAG 15540

CAACATGCAA CACGTACTTT AAGGAAGTCA AAATTTATCA TTTAGGAGAG ATGGATATGA 15600

AAATCGTAGC ATTATTTCCA GAAGCAGTAG AAGGTCAAGA AAATCAATTA CTTAATACTA 15660

AAAAAGCATT AGGATTAAAA ACATTTTTAG AGGAAAGAGG ACATGAGTTC ATTATATTAG 15720

CAGATAATGG TGAAGACTTA GATAAACATT TACCAGATAT GGATGTGATT ATTAGTGCGC 15780

CATTTTATCC TGCATATATG ACTCGTGAAC GTATTGAAAA AGCACCGAAC TTGAAATTAG 15840

CAATTACAGC AGGTGTAGGA TCTGACCATG TAGATTTAGC GGCAGCAAGT GAACACAATA 15900

TTGGTGTCGT TGAAGTTACA GGAAGTAATA CAGTTAGTGT GGCAGAACAT GCGGTTATGG 15960

ATTTATTAAT ACTTCTTAGA AACTATGAAG AAGGTCATCG TCAATCAGTA GAAGGTGAAT 16020

GGAACTTGTC TCAAGTAGGT AATCATGCGC ATGAATTACA ACACAAAACA ATTGGTATTT 16080

TACAACACTA TGATCCAATC AATCAACAAG ACCATAAATT GTCTAAATTT GTAAGCTTTG 16200

ATGAACTTGT TTCAACAAGT GATGCGATTA CAATTCATGC ACCATTAACA CCAGAAACTG 16260

ATAACTTATT TGATAAAGAT GTTTTAAGTC GTATGAAAAA ACACAGTTAT TTAGTGAATA 16320

CTGCACGTGG TAAAATTGTA AATCGCGATG CGTTAGTTGA AGCGTTAgCA TCCGAGCATT 16380

TACAAGGATA TGCTGGTGAT GTTTGGTATC CaCAACCtGC ACCTGCTGAT CATCCATGGA 16440

GAACAATGCC TAGAAATGCT ATGACGGTTC ACTATTCAGG TATGACTTTA GAAGCACAAA 16500

AACGTATTGA AGATGGAGTT AAAGATATTT TAGAGCGTTT CTTCAATCAT GAACCTTTCC 16560

AAGATAAAGA TATTATTGTT GCAAGTGGTC GTATTGCTAG TAAAAGTTAT ACAGCTAAAT 16620

AGAATAAGGA TGCTGGGCTA GCGATTAACG CTTTCAATTT TATATAAATG AATCATATAA 16680

GCACTACTGC TGTTGTAAAG ATGGCAGTAG TTTTTTTATG ATTACATCTA AGTATAGTCA 16740

CGGCTATGTT AGGACAATGA TTTAACATTT ACGCACATAT GTGTTCACTT ACGCAATTAT 16800

TGAnAAATnT CATTCATGTG GnAATC 16826
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4012 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

TTCAATGAGA GTAGTGGGCT GATGTTTAGC GATATCGCGT AAGATTAACC ATTGGCCATA 60

ATATATATTG TGTTTTTCTA AAATCGGCTC GGCTAATTTT AAATAGGGGC GATATATTGT 120

TATAAAACTA TTGAAAAATT CTTGTGATAG CATAGTGACA TCTCCTAAGA CAAAATAGTT 180

AGCTTAGCTA mCCTTTTTAC AACAATAGTA ATTATAAAAC GGGAGCAATT AGAAATCAAT 240

ATATAATTAT TAAGAGCAAA AATAATTATA CTTTGTTAAA ATAAGCGTAA TTACATGTAA 300

ATAGGGGGAT ACTAATGATA TTGAAATTTG aTCACATCAT TCATTATATA GATCAGTTAG 360

ATCGGTTTAG TTTTCCAGGA GATGTTATAA AATTACATTC AGGTGGGTAT CATCATAAAT 420

ATGGAACATT CAATAAATTA GGTTATATCA ATGAAAATTA TATTGAGCTA CTAGATGTAG 480

AAAATAATGA AAAGTTGAAA AAGATGGCAA AAACGATAGA mGGCGGAGTC GCTTTTGCTA 540

CTCAAATTGT TCAAGAGAAG TATGAGCAAG GCTTTAAAAA TATTTGTTTG CGTACAAATG 600

ATATAGAGGC AGTTAAAAAT AAACTACAAA GTGAGCAGGT TGAAGTAGTA GGGCCGATTC 660

ATCAGGATGA TGATGAAATT AAGCCACCAT TTTTTATTCA ATGGGAAGAA AGTGATTCCA 780

TGCGTACTAA AAAATTGCAA AAATATTTTC AAAAACAATT TTCAATTGAA ACTGTTATTG 840

TGAAAAGTAA AAACCGATCA CAAACAGTAT CGAATTGGTT GAAATGGTTT GATATGGACA 900

TTGTAGAAGA GAATGACCAT TACACAGATT TGATTTTAAA AAATGATGAT ATTTATTTTA 960

GAATTGAAGA TGGTAAAGTT TCAAAATATC ATTCGGTTAT CATAAAAGAC GCACAAGCAA 1020

CTTCACCATA TTCAATTTTT ATCAGAGGTG CTATTTATCG CTTTGAACCA TTAGTATAAA 1080

TATACGTAAG TGCTATGAGC GAGAATGCCC ATATGAATAA TGACAAGCAC AATGGAAAGA 1140

ATCGTTAATA TATTATTTAA TCGTGATGAC TTAATTAAAA TGAAAAAGAT TGATAATATA 1200

AATGTGAAAA AGATAAGTAT AACCCGTAAA CTAAAGTAAT TCACGGTGAG AGGTTGACTC 1260

AATGTCATAA TGATTGCAAC GATGTTCATA ATTATAAATA GACTTAAAAT AATTGTTCTC 1320

ATATCAAACA CCTCATTGTT AGATTATTGA CATTATAACA GGGGTAATTG TATATGAACA 1380

«T"rp7v j^rn/~tmr*r>*r> t/^ <t»t«o a nn n AnMvnwpRTT n^TTPHJinTr 1 tv JvnnrpnnTTP a r P r r w r r r h r* A A A i a a n 

TGAATATCGT GTTAGATGAT GAAAGTATAT TGAAGTATAG GTAACTAGTT GAAAAGTATT 1500

AATTGTACGA TAACATTAAA TTTAACACGA AACATAGATA TAAAATGATT CACAATTAAA 1560

ATGGGTAAAT TTGAACTTGC TAAACTATTA ATTGGAGCAT GGACATTTCA AAAATAAGAG 1620

TTCAAATCTT ACACAAGCTC TGAATCGACA CTATAAGATA CAAACTGTAT AATTAAAGGT 1680

ATTGTTAAAT AGAAGGAGAT ATCATAAATC ATGGAAAAGA TGCATATCAC TAATCAGGAA 1740

CATGACGCAT TTGTTAAATC CCACCCAAAT GGAGATTTAT TACAATTAAC GAAATGGGCA 1800

GAAACAAAGA AATTAACTGG ATGGTACGCG CGAAGAATCG CTGTAGGTCG TGACGGTGAA 1860

GTTCAGGGTG TTGCGCAGTT ACTTTTTAAA AAAGTACCTA AATTACCTTA TACGCTATGT 1920

TATMTTCGC GTGGTTTTGT TGTTGATTAT AGTAATAAAG AAGCGTTAAA TGCATTGTTA 1980

GACAGTGCAA AAGAAATTGC TAAAGCTGAG AAAGCGTATG CAATTAAAAT CGATCCTGAT 2040

GTTGAAGTTG ATAAAGGTAC AGATGCTTTG CAAAATTTGA AAGCGCTTGG TTTTAAACAT 2100

AAAGGATTTA AAGAAGGTTT ATCAAAAGAC TACATCCAAC CACGTATGAC TATGATTACA 2160

CCAATTGATA AAAATGATGA TGAGTTATTA AATAGTTTTG AACGCCGAAA TCGTTCAAAA 2220

GTGCGCTTGG CTTTAAAGCG AGGTACGACA GTAGAACGAT CTGATAGAGA AGGTTTAAAA 2280

ACATTTGCTG AGTTAATGAA AATCACTGGG GAACGCGATG GCTTCTTAAC GCGTGATATT 2340

AGTTACTTTG AAAATATTTA TGATGCGTTG CATGAAGATG GAGATGCTGA ACTATTTTTA 2400

GTAAAGTTGG ATCCAAAAGA AAATATAGCG AAAGTAAATC AAGAATTGAA TGAACTTCAT 2460



EP0 786 519 A2 



CAAAATATGA TTAATGATGC GCAAAATAAA ATTGCTAAAA ATGAAGATTT AAAACGAGAC 2580

CTAGAAGCTT TAGAAAAGGA ACATCCTGAA GGTATTTATC TTTCTGGTGC ACTATTAATG 2640

TTTGCTGGCT CAAAATCATA TTACTTATAT GGTGCGTCTT CTAATGAATT TAGAGATTTT 2700

TTACCAAATC ATCATATGCA GTATACGATG ATGAAGTATG CACGTGAACA TGGTGCAACA 2760

ACTTACGATT TCGGTGGTAC AGATAATGAT CCAGATAAAG ACTCAGAACA TTATGGATTA 2820

TGGGCATTTA AAAAAGTGTG GGGAACATAC TTAAGTGAAA AGATTGGTGA ATTTGATTAT 2880

GTATTGAATC AGCCATTGTA CCAATTAATT GAGCAAGTTA AACCGCGTTT AACAAAAGCT 2940

AAAATTAAAA TATCTCGTAA ATTAAAACGA AAATAGATTA ACGACTGAAA TCTGAACGCT 3000

CATAAGACTG TCATTTGCGT TCAGATTTTT TTACACAATA TAGAATGGTT GAGTAAAATA 3060

TTTTTGAATA TAGTGAAAGA GGGGGAAGTA CTGTGATAAA AAAGCTATTA CAATTTTCTT 3120

TAGGGAATAA GTTTGCTATC TTTTTAATGG TTGTTTTAGT TGTCTTGGGC GGTGTATATG 3180

CGAGTGCTAA ATTGAAATTA GAATTACTAC CAAATGTACA AAATCCAGTT ATTTCAGTTA 3240

CAACAACAAT GCCGGGTGCA ACGCCACAAA GTACCCAAGA TGAAATAAGT AGTAAAATTG 3300

ACAATCAAGT AAGATCATTG GCATATGTGA AAAATGTTAA AACGCAATCC ATACAAAATG 3360

CTTCAATTGT AACAGTTGAA TATGAAAATA ATACAGATAT GGATAAAGCA GAAGAACAGC 3420

TTAAAAAAGA AATCGATAAA ATTAAATTTA AAGATGAAGT TGGTCAACCA GAATTAAGAC 3480

GTAATTCGAT GGATGCTTTT CCGGTTTTAG CATATTCATT TTCAAATAAA GAGAATGACT 3540

TGAAAAAAGT AACGAAAGTA CTGAATGAAC AATTAATACC AAAATTGCAA ACGGTAGATG 3600

GTGTGCAAAA TGCGCAATTA AATGGGCAGA CGAACCGTGA AATCACCCTT AAATTTAAGC 3660

AAAATGAACT TGAAAAATAT GGGTTGACTG CTGATGATGT AGAAAACTAT CTAAAAACGG 3720

caacSagaac AACGCCACTT GGATTGTTCC AATTTGGTGA TAAAGATAAT CAATTGTTGT 3780

TGATGGTCAA TATCAATCTG TTGATGCTTT TAAAAACATA AATATTCCAT TAACGTGGCA 3840

GGAGGACCAA GGGCATCTCA TCCCAAAGTG ACCATAAACC AAATTCAGCC ATGTCAGACG 3900

TTATCAGGCA TCACCACAGC AAATTCAAAG CGTCAGCnCC AATATATAGT GGATGCCGCA 3960

nGAACTAGGG GTTTAGCGnT ATCAGTGGTG TGGCGACTCT ATTCTAAACG AT 4012


(2) INFORMATION FOR SEQ ID NO: 48: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7778 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

CAATATAGGT CGCCGAGTTT CAACTaCATC AACTGGTTCA GTTACATTAG ATAATGCGCT 60

AGGTGTAGGT GGCTATCCTA AAGGACGAAT TATTGAAATT TATGGTCCTG AAAGTTCTGG 120

TAAGACAACA GTAGCGCTTC ACGCTATTGC TGAAGTACAA AGTAATGGCG GGGTGGCAGC 180

ATTTATCGAT GCTGAACATG CTTTAGATCC AGAATATGCT CAAGCATTAG GCGTAGATAT 240

CGATAATTTA TATTTATCGC AACCGGATCA TGGTGAACAA GGTCTTGAAA TCGCCGAAGC 300

ATTTGTTAGA AGTGGTGCAG TTGATATTGT AGTTGTAGAC TCAGTTGCTG CTTTAACACC 360

TAAAGCTGAA ATTGAAGGAG AAATGGGAGA CACTCACGTT GGTTTACAAG CTCGTTTAAT 420

GTCACAAGCG TTACGTAAAC TTTCAGGTGC TATTTCTAAA TCAAATACAA CTGCTATTTT 480

CATCAACCAA ATTCGTGAAA AAGTTGGTGT TATGTTCGGT AATCCAGAGA CTACACCAGG 540

TGGACGTGCA TTAAAATTCT ATAGTTCAGT AAGACTAGAA GTACGTCGTG CAGAACAGCT 600

TAAACAAGGA CAAGAAATTG TAGGTAATAG AACTAAAATT AAAGTCGTTA AAAATAAAGT 660

GGCACCACCA TTTAGAGTAG CTGAAGTTGA TATTATGTAT GGACAAGGTA TTTCTAAAGA 720

GGGTGAACTT ATTGATTTAG GTGTTGAAAA CGACATCGTT GaTAAATCAG GAGCATGGTA 780

TTCTTACAAT GGCGAACGAA TGGGTCAAGG TAAGGAAAAT GTTAAAATGT ACTTGAAAGA 840

AAATCCACAA ATTAAAGAAG AAATTGATCG TAAATTGAGA GAAAAATTAG GTATATCTGA 900

TGGTGATGTT GAAGAAACAG AAGATGCACC AAAGTCATTA TTTGACGAAG AATAGTACAC 960

AAATTTATAT CTATAGTTAA ACTTAGCAAA TATCCTTATA GGATTGATTG AAAGTGATAT 1020

TCATCTCATA AAGCTAGAAT AATATCTAAC TTTATGGGAT ACACTACAAA TCGAGACTAT 1080

AAGGTTTTTT ATTTTATTTA TTATTACATT ATCAATAGTT TTATAATCGA GCTTCAAAAC 1140

TTTAGAAAAT AGTAGAAATA GCATTCAATA TAGTGCAAAA GTGCAAATTG ATAACTTGAC 1200

ACTTATCTCC TATAAACCGT ACAATTAATT TGTATGATTT ATATATAATT TCATAAAGTC 1260

ATATTGAATT TCATATAAAG AGCAAACCCT AGAAAAGGAG GTGTTTGTGT GAATTTATTA 1320

AGCCTCCTAC TCATTTTGCT GGGGATCATT CTAGGAGTTG TTGGAGGGTA TGTTGTTGCC 1380

CGAAATTTGT TGCTTCAAAA GCAATCACAA GCTAGACAAA CTGCCGAAGA TATTGTAAAT 1440

CAAGCACATA AAGAAGCTGA CAATATCAAA AAAGAGAAAT TACTTGAGGC AAAAGAAGAA 1500

AACCAAATCC TAAGAGAACA AACTGAAGCA GAACTACGAG AAAGACGTAG CGAACTTCAA 1560

AGACAAGAAA CCCGACTTCT TCAAAAAGAA GAAAACTTAG AGCGCAAATC TGATCTATTA 1620

GATAAAAAAG ATGAGATTTT AGAGCAAAAA GAATCAAAAA TTGAAGAAAA ACAACAACAA 1680

CGCATCTCCG GTCTCACTCA AGAAGAAGCT ATTAATGAGC AACTTCAAAG AGTAGAGGAA 1800

GAACTGTCAC AAGATATTGC AGTACTTGTT AAAGAAAAAG AAAAAGAAGC TAAAGAAAAA 1860

GTTGATAAAA CAGCAAAAGA ATTATTAGCT ACAGCAGTAC AAAGATTAGC AGCAGATCAC 1920

ACAAGTGAAT CAACGGTATC AGTAGTTAAC TTACCTAATG ATGAGATGAA AGGTCGAATC 1980

ATTGGACGAG AAGGACGAAA CATCCGCACA CTTGAAACTT TAACTGGCAT TGATTTAATT 2040

ATTGATGACA CACCAGAAGC GGTTATATTA TCTGGTTTTG ATCCAATAAG AAGAGAAATT 2100

GCTAGAACAG CACTTGTTAA CTTAGTATCT GATGGACGTA TTCATCCAGG TAGAATTGAA 2160

GATATGGTCG AAAAAGCTAG AAAAGAAGTA GACGATATTA TTAGAGAAGC AGGTGAACAA 2220

GCTACATTTG AAGTGAACGC ACATAATATG CATCCTGACT TAGTAAAAAT TGTAGGGCGT 2280

TTAAACTATC GTACGAGTTA CGGTCAAAAT GTACTTAAAC ATTCAATTGA AGTTGCGCAT 2340

CTTGCTAGTA TGTTAGCTGC TGAGCTAGGC GAAGATGAGA CATTAGCGAA ACGAGCTGGA 2400

CTTTTACATG ATGTTGGTAA AGCAATTGAT CATGAAGTAG AAGGTAGTCA TGTTGAAATC 2460

GGTGTAGAAT TAGCGAAAAA ATATGGTGAA AATGAAACAG TTATTAATGC AATCCATTCT 2520

CATCATGGTG ATGTTGAACC TACATCTATT ATATCTATCC TTGTTGCTGC TGCAGATGCA 2580

TTGTCTGCGG CTCGTCCAGG TGCAAGAAAA GAAACATTAG AGAATTATAT TCGTCGATTA 2640

GAACGTTTAG AAACGTTATC AGAAAGTTAT GATGGTGTAG AAAAAGCATT TGCGATTCAG 2700

GCAGGTAGAG AAATCCGAGT GATTGTATCT CCTGAAGAAA TTGATGATTT AAAATCTTAT 2760

CGATTGGCTA GAGATATTAA AAATCAGATT GAAGATGAAT TACAATATCC TGGTCATATC 2820

AAGGTGACAG TTGTTCGAGA GACTAGAGCA GTAGAATATG CGAAATAATT TTTGTCTCCC 2880

TCACAAATTA GTGAGGGAGC TTTTTTAAGT TGTAGTCTTA AtCTAGTTAG ACAGCACTTT 2940

ATCGGTAATA ACTATATTAA ACAGTAGTTA TTTGAAAGTA AGACGGACCT TATATTAAAT 3000

AAGAAGTTAT TGCTTTTAAT AAAAATGTTT TAGGCTTCGT AATTACTATA TTTATATTAT 3060

GTAAACCTAT AAAGATGATT GGTTTTCTAT CCAATAAAAA AGAAGAGAAG ATGTAACACA 3120

TCTTCTCTTC yGCAATATTA ATTAGGATTT ATTTCTAAGT TGAGTTATTT TAATTGTAAA 3180

TCTGTTTTCT TTAATTCTTT TATAACTTCT GCAGTATCAT AACAATTTGT TGCAATTGTT 3240

GAATATCTCT CTGCTAAACG ATATGCATTA ATGTAAAGCT TTAAACTTTC TTTAGCTATA 3300

TCCTCTGCAT CTTCGAATTT TGATGGGTTA GACATAACCA CTAATTCTGC AAATTTTTCT 3360

GGATCAATAT TAATAGACAT GTATTTATTT ACAACTCCTA TTTATTTTGA TGTCTTAATA 3420

CTAACATATT GAAGTTTTCA GACAAAGTAA TGTCTCTCTA TAATTGAAGA AAAATAATTC 3480

GGATGAACAA AACATGAGAA TAATGTTTAT AGGGGATATC GTAGGTAAAA TTGGACGAGA 3600

CGCAATTGAA ACGTACATAC CTCAACTGAA GCAAAAGTAT AAACCAACAG TTACAATTGT 3660

AAATGCTGAA AATGCAGCAC ATGGTAAAGG TTTGACTGAA AAAATATATA AACAATTACT 3720

AAGAAATGGT GTAGATTTCA TGACTATGGG TAATCACACA TATGGTCAAC GTGAAATTTA 3780

TGATTTTATA GATGAAGCAA AACGACTAGT AAGACCAGCG AATTTTCCGG ATGAAGCGCC 3840

GGGAATTGGT ATGAGATTTA TACAAATTAA TGATATTAAA CTTGCAGTTA TTAATCTGCA 3900

AGGAAGAGCG TTTATGCCAG ATATTGATGA TCCTTTTAAA AAGGCAGATC AATTAGTCAA 3960

GGAAGCACAA GAACAAACTC CGTTTATATT TGTTGATTTT CATGCAGAAA CAACTTCTGA 4020

AAAGTATGCA ATGGGATGGC ATTTAGATGG TAGAsTAGCG CTGTTGTTGG AACGCATACA 4080

CACATTCAAA CAGCAGATGA ACGTATTTTA CCAAAGGGGA CAGGGTATAT AACGGATGTT 4140

GGTATGACAG GTTTTTATGA TGGCATTTTA GGAATAAATA AAACAGAGGT AATTGAGCGT 4200

TTTATCACTA GTTTGCCACA AAGACATGTT GTTC CZAAA'IXj AAGGTAGAAG TGTATTATCT 4260

GGTGTTGTTA TTGATTTAGA CAAAGAAGGT AAAACAAAGC ACATCGAACG TATATTGATA 4320

AATGATGACC ATCCATTTTC AACATTTTAA AATTACGTAA GTAAACATTC GAATTGGACC 4380

CTATCGTCGA TTAGTATGAA TTTAATATAG TACCACTGTT TACATAGTAA ATCGGTGGTT 4440

CTTTTTGTTA TCATTTAATA TGAAATATAT CCATAGGAGG CATATAACTA TGAAACCACA 4500

ATTATCGTGG AAAGTTGGCG GTCAACAAGG CGAAGGTATT GAATCAACTG GGGAAATCTT 4560

TATCAAAGGT GGACATACGA ATAATAAAAT TAGAGTTTCT ACGACGCCTG TTCATGCAAT 4680

TAGTGATGAT TTAGATATTT TGATTGCATT TGACCAAGAA ACAATTGATG TTAACCATCA 4740

TGAAATGAGA GAAGACAGTA TTATTTTArC TGATGCCAAG GCTAAACCTG TGAAaCCAGA 4800

AGGATGTCAT GCACAGCTTA TTGAATTACC TTTTACAGCA ACCGCTAAAG AATTAGGTAC 4860

AGCATTAATG AAAAACATGG TTGCAATAGG TGCTACTAGC GCATTGATGA ATTTGAATAC 4920

AAATACATTT GAAGAACTTA TTACTAATAT GTTTTCTAAA AAAGGTGACA AGGTAGTTGA 4980

AGTCAATATC CAAGCATTAA ACGAAGGTTA TCAATTAATG CAATCTCGCT TACCTGAAAT 5040

CTACGGGGAC TTTGAATTAG AGTCAACAGA TGCACTACCA CATCTATATA TGATTGGTAA 5100

CGATGCCATT GGATTAGGTG CAATTGCTGC AGGTTCACAA TTTATGGCGG CATATCCTAT 5160

TACACCTGCG TCTGAAGTTA TGGAATATAT GATTGCCAAT ATATCTAAAG TAAACGGAGC 5220

GGTTATTCAA ACAGAAGATG AAATTGCTGC TGTAACTATG GCTATTGGTG CAAATTATGG 5280

TGGATTATCT GGTATGACTG AAACGCCATT AGTCATTATT AATACCCAAC GAGGTGGACC 5400

TTCTACTGGA TTACCTACGA AACAAGAACA GTCAGATTTA ATGCAAATGA TTTATGGTAC 5460

ACATGGTGAT ATTCCAAAAA TTGTTGTAGC ACCAACAGAT GCAGAAGATG CATTTTATTT 5520

AACTATGGAA GCATTTAATT TAGCAGAACA ATATCAATGC CCTGTTATAG TTCTAAGTGA 5580

TTTGCAATTA TCTTTAGGTA AACAAACTGT TGAAAAATTA GATTATAATC GTATTGAAAT 5640

TAAACGTGGT GAAATCATTC AATCTGATAT TGAACGTGAA GAAGATGATA AAGGTTATTT 5700

CAAGCGTTAT GCGTtAACAT CCGATGGTGT TTCTCCTAGA CCTATCCCCG GTGTTAAAGG 5760

AGGTATTCAT CATATAACTG GTGTGGAaCa CAATGAAGAA GGTAAACCTA GTGAATCTGC 5820

GTCAAATAGA CAACAACAAA TGGAAAAACG AATGCGTAAA ATTGAGCAGT TACTAATTGA 5880

ATCGCCAGTA GAAGCTAACT TACAACATGA GGATGCAGAT ATTCTTTATA 7CGGTTTTAT 5940

TTCTACAAAA GGTGCAATTC AAGAAGGTAG TAACCGTTTG AATCAACAAG GCATAAAAGT 6000

TAACACTATA CAAATTAGAC AATTGCATCC ATTCCCAACA AGCGTTATTC AAGATGCAGT 6060

TAATAAAGCG AAGAAAGTCG TTGTAGTGGA GCACAATTAT CAAGGACAAT TGGCTAGTAT 6120

TATAAAAATG AATGTCAATA TTCATGATAA GATTGAAAAT TATACAAAGT ATGATGGGAC 6180

ACCTTTCCTA CCACATGAAA TCGAAGAAAA AGGCAAAATA ATTGCTACTG AAATAAAGGA 6240

GATGGTATAG ATGGCGACAT TTAAAGATTT TAGAAATAAT GTTAAGCCTA ACTGGTGCCC 6300

CGGATGTGGC GATTTCTCAG TACAAGCTGC AATTCAAAAA GCAGCCGCAA ATATAGGGTT 6360

AGAACCTGAA GAAGTAGCTA TCATCACCGG TATAGGATGT TCTGGCCGTC TTTCAGGATA 6420

TATTAATTCT TATGGCGTTC ATTCTATTCA CGGACGTGCA TTACCTTTAG CTCAAGGTGT 6480

AAAAATGGCG AATAAAGATT TAACTGTTAT TGCATCGGGA GGAGATGGTG ATGGTTATGC 6540

TATAGGTATG GGGCATACAA TCCATGCTTT AAGAAGAAAT ATGAACATGA CGTATATAGT 6600

CATGGATAAT CAAATTTATG GTTTGACAAA GGGACAAACA TCGCCGTCAT CAGCAGTAGG 6660

ATTTGTTACT AAAACAACGC CAAAAGGTAA TATAGAAAAA AATGTTGCGC CTTTAGAATT 6720

AGTATTATCA TCTGGTGCCA CATTTGTAGC CCAAGGTTTT TCAAGCGATA TTAAAGGATT 6780

AACAAAACTA ATTGAAGATG cAATTAATCA TGATGGATTT TCATTCGTTA ATGTCTTTTC 6840

ACCATGTGTG ACTTATAATA AAATTAACAC ATACGATTGG TTTaAAGAAC ATTTAACAAG 6900

TGTTGATGAc ATTGAAAATT ATGATTCTAC AGATAAACAA TTAGCGACTA AAACTGTTAT 6960

TGAACATGAA TCTTTAGTAA CTGGTATTGT TTATCaAGAT AAAGAAACAC CATCATATGA 7020

ATCtCAAATT AAAGAGTTAG ATGATmCACC ACTTGCTAAA AGAGATATCa AAATTaCTGA 7080

TGTATTTATA ACAGATCCAT TTATGCTACT CAGTTTTTTA CTATTACAAA AAATAAAGGA 7200

GTTTTTAAAA ATGAAAGACA CATTAATGAG TATACAAATA ATTCCTAAAA CACCAAACAA 7260

TGACAATGTT ATACCTTACG TAGACGAGGC GATTAAAATA ATTGACGAAT CTGGTTTGCA 7320

TTTTAGAGTA GGTCCGTTAG AAACGACAGT ACAAGGAAAT ATGAATGAAT GTTTAATTTT 7380

AATACAATCA TTAAATGAAC GAATGGTGGA ACTTGAATGT CCAAGTATTA TTAGCCAAGT 7440

TAAGTTTTAT CATGTGCCAG ATGGCATCAC TATTGAAACT TTAACTGAAA AATATGATGA 7500

ATAACATTAA AAGTGAAGTA AACTGGATTT GAATTGGCTT GTTAGAGATG ACGTATAACT 7560

TTAACTGTTT TTGCACTTTA TAGTTAAATT TAATATAATT ATTAAATGAT ACGGGCAAAT 7620

AGAAAGGATT TTGTAAAGTG AACGAAGAAC AAAGAAAAGC AAGTTCTGTA GATGTTTTAG 7680

CTGAGAGAGA TAAGAAAGCA GAAAAAGATT ATAGTAAATA TTTTGAACAT GTTTATCAGC 7740

CGCCTAATTT AAAAGCAAGC GCAAAAAAAG AGGTnAAA 7778

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1128 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

AGATGAAGTT GTTACgAAAA TTGCGTACGC TGTTTCAGAA CATGTCAAAA TAGAAACAGG 60

TAATCCATTC TTTCAAACAT CACATAGTGG TTGTGCGACG GGCGGATCCT GTAATTGTTC 120

ATTATAAAAA ACATCGAGTC AGAAAAAGGT GGTTATTGAA cCACTAACTA GCATCTGACT 180

CGATGTTTTT ATTTATTCGG GATTGTTTGT TTGAATTGTT GTGCTAAATC TGGTCGATCT 240

GTCACAATCG TGTGTGCACC TTTTTGGTAT AAATCATTCA TCAGATTTAT ACTATTTACG 300

CCATAATAGC CTGGAATGAT ATTCATATCA TTTAACCATT TGATAAAACG AGATGAAGTC 360

AAATCAATGC CTTTAAAATG AGTAGGCATT TGGAACGTTT GTGCTAATGG TTGGTAGTAC 420

CTACCACCTA ATAAATGATA TTTTAAAAAT GCTTCTGTAA CTTCCTGTTG GCTAGCACCA 480

ATTGCGACGG ATCCTTGTGC AATTTTATTA AAACGAACGA TTTGTTCTTT ATAAAAACTT 540

GTCACAAGAA CGCGGTCAAA TGCTTGATTT TCTGCAATTG TATCAAACAT AATTTGTGGT 600

.^nrr,- ATAGGA TTCAGGAGCA TCTTTTAAGT CTACGTTTAT ATACATATCA 660

AATGTATGGG CACTAACTTT TCCAGAGCCG TTCGTCGTTC TATCAACAGT TGCGTCATGA 840

AAAACGATAA GCTGTTGATC TTTTGTGAGT CTCACATCTG TTTCAAAGCC ATCAACGCCT 900

AATTGTTTAG CATAGTCAAA TGCAAGTTGC GTTTGCTCTG GTCTTAAAGC CATACCACCG 960

CGATGCGCAA ATATATATGG TGCATTGCCT TTGAAAAAAG CAGGGATGGT TTGCTTTTTA 1020

GTAATCACTT TATTTTTATT GATCATTAAT AGACTACTTA AAAATCCAGC ACCGACTAGT 1080

ACCGCATTTA AAATGTTTCT GTTTACnTTT TTCATAAAAA ATTCCTCC 1128
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6252 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

CAAGCAAACA ATCGTCGATA AAATTGCTAA AATAATAAAA GTAATTCGAA CTTTCATCAT 60

GATCATCCTT TGTTTATAGA GTCAATATAA GTATGGAATA TGTTAGGTAT ATAGTCAAAT 120

GCGTCAACTA ATGGGAATTT TGGCATAGAT AGAGAATTTA AGGCAATTAA AAAGGCATCA 180

AACAGTAATA TGCTGCTTGA TGCCCAAATG ATGACTTTAG CTAAATTGAT TAGTCACTTT 240

TAAAGATAAA GAATTGTCAT GAATTAAAAC TCATGTAATG ATGTGTTACA TTTCGCAATG 300

ATGGCTTTCA GTTATTTATC GATAACATCA CTCTTGATAC CTTTAGATTT TAAGAAATCT 360

TTAATTTTAT CTTGTTGCTT TTTATTAACA TCACCGGCAT ATTTTGTTGG CACGTCGACA 420

ACATTGATTT TATTTTGCGG TTGATAGCTA AGCTTTTCAA TATCTTCATC AACATTGGCG 480

ATTGTACTAT TTAAAGCTTT GAAGTAATTC ATCATTAATT CAACGGGTTT CTTATATTCT 540

TTAGGAATAT TGTTTTCAGT GACAAATTTC TTGAAATGCA AATCGTTTTT AACAGCTAAG 600

TTAGATAAGT GGCTAAGTGT TTCTGCTTGT TTTTCAGTCA CTTTTGTTTG ACTGTCAATT 660

TGTTTATCTA GTTTATGTTG CATAATATAT TTGTTATCAA GTATATCGCT ATTTACAGAC 720

AAATACTTTT CTATAGCTTG CTTCATCTCT GCATCACTAA TATCACTATT TTTCTTATCT 780

GAGTTAAAGA TATCTTTTGT tTCTAATTTT TTAGCGCTTT TAGGTGCATG GATGCCAGTA 840

CTTGTATGAT GATCTTCGTT ATCAGATTGA TCGGACGCGC AACCTGTAAG AATTAATGTC 900

GATGCTAAAA ATGTACTTAG TAGTAATCTC TTTTTCATAA TGTAATATAA CTCCTTAGTT 960

TATCTTTAAT TGAAAAAATA TGTATTCATG TTTAATAGAG TAACATTGAA TTAGTTTGGA 1020

TCTATCAATA ATGCATCATT TTGGACGTTG TTAAGGATAG CTTTATCTAT AAATAACTGC 1140

ATAATTGGTT GTACTAATTT AGACGTAGGT ATCGTACGTA AAAGCATAAT AATTTCGTTC 1200

ACATACTTTT CTTTCTCAAT ATCATTTTTC ATATTGATTT GTTTGCGAGA GGTACATACT 1260

TTAAGCATTA TCGCACATCT CGTTGTATAT ATTAAGTTTA TCATAACATG ATTTTATGTC 1320

GGGATAAAAA AATAACAGCA TCTTAACAAA TGTAAGATAC TGTCAGTGAA ATGAATGAAA 1380

CTTTAGTTTC TGaTAATATA GTCAAAGGCA TTTAATGCTG CATTTGCACC AGCGCCCATT 1440

GAAATGATAA TTTGTTTGTT CTTCTGATCT GTGACATCGC CAGCAGCAAA TATTCCAGGA 1500

ACATTCGTAT TATTGTTACG ATCAATCACA ATTTCACCAC GTTCGTTTAA TTCAACAGCA 1560

TCGTTTAACC ATGATGTGTT TGGAAGTAAA CCAATTTGAA CAAAGATACC ATCTAAGTTA 1620

AGTAGATGTT CTTCGCCGGT GTTCATGTCT TCGTAACGTA TACCTGTAAC ATGGTCTTCT 1680

CCGACAACTT CAGTAOTTTT GGCATTTGTT TTGATATCAA CATTTGATAA AGAACGTAAA 1740

CGATCTTGTA ACACGTTGTC TGCTTTTAAT TCGCTAGCGA ATTCGAATAA TGTAACATGA 1800

TTAACGATAC CAGCAAGGTC AATTGCTGCT TCAACCCCAG AGTTACCGCC ACCGATAACT 1860

GCTACGTCTT TATTTTCAAA TAGAGGTCCG TCACAGTGAG GGCAGAATGC AACACCTTTA 1920

TTAATCAATT GCTCTTCACC TGGAATGTTT AGCTTACGCC AACCTGCACC AGTAGCAATA 1980

ATGACTGTTT TACTTTCTAA GACAGCACCG TTTTCTAACG TAACTTTAAT TGCTTCGTCA 2040

GTCTTTTCGA TATCTGTAGC ACGTATACCT GTCATTGCAT CAATGTCATA TTGATCAATG 2100

TGCGCTGCTA AGTTAGAAGA AAATTCAGAA CCAGTTGTTT CTTTAACAGT AATGAAGTTC 2160

TCAATACCAG CAGTATCATT AACTTGGCCA CCGATACGAT CAGCAACTAT ACCAGTACGT 2220

AAACCTTTAC GTGCTGTGTA AATCGCTGCA CTACCACTAG CAGGACCACC ACCAACGATT 2280

AAGACATCAT AAGGTTCTTT ATTTTCAAAC TCAGATGCAT CTGCCGTACT GCCTAGTTTC 2340

GAAAGAATAT CTTGGATTGT CATACGACCA TTGCCAAATT CTTCGCCATT TAAAAAGACA 2400

GCAGGGACTG CCATGATGTT TTCAGATTCT TCACGGAACA CTGCACCATC AATCATAGAA 2460

TGCGTGATGT TAGGGTTGAT CACACTCATT AAGTTAAGTG CTTGAACGAC ATCAGGACAT 2520

TTTTGACACG TTAAACTAAT GAATGTTTCA AAATGGAATG AACCTTCTAA TTTTTTAATT 2580

TGGTCAATGA TTGACTGTTT TTCTTTAGGT GCACGACCAC TAACCTGTAA AATTGCTAAA 2640

ACAAGTGAGT TAAACTCGTG ACCTAATGGA ATACCTGCAA ATGTTACACC TGTTTCTTCG 2700

AC-*^r;AT TGACTGAGAA ACTTGGTGTA CGTTTTAAAG ATTTTTCAGA AAGAGATAGT 2760

TGTTGTTTTA AATCAGCATT AAGCATGGTT GTAATGCCTC CTTAGATTTT ACCTACTAAA 2940

TCTAAACCAG GTTGCAATGT TTTAGCGCCT TCTTCCCATT TAGCTGGGCA TACTTCGCCA 3000

GGGTTTTTAC GAACATATTG AGCTGCTTTG ATTTTGTGAG CTAATGTACT AGCGTCACGG 3060

CCAATTCCGT CAGCGTTAAT TTCAGATGCT TGTACAACAC CGTCTGGGTC GATAATGAAT 3120

GTACCACGTT GAGCTAAACC AGTAGCTTCA TCTAATACAT CAAAATTACG AGTGATTGTT 3180

TGTGATGGGT CACCAATCAT AGTGTAAGTG ATTTTGCTAA TTGCATCTGA ATGGTCATGC 3240

CATGCTTTGT GTACGAAGTG AGTATCAGTT GATACTGAGA ATACATTTAC GCCTAATTTT 3300

TGTAATTCTT CATATTGGTT TTGTAAGTCT TCTAATTCAG TTGGACAAAC GAATGAGAAG 3360

TCAGCAGGAT AGAAGCATAC TACGCTCCAA GAACCTTTTA AATCTTCTTG TGTAACTTCT 3420

TTAAATTGAT CTTT1TTTGG ATCGAAArCT TGCGCTGTAA ATGGTAAGAT TTCTTTGTTA 3480

ATTAATGACA TAAATATCTT CCTCCTAAGA ATTTAAGTAT GAATTAGAAC TATCAATTGA 3540

TTGCGCTTAA TTATAATAAT TCTAATCTCT TAGTTAGCAT TATTACATTT TGATCCAGAA 3600

TAGTCAACTG GATAACTTTG TAAAGTGAAT GATTACTTTT AAAATAAAGA AAGATAATAT 3660

AAAGTGCTTT GATAATGGAT TTTGTAGTTG ATGATTTAAA AGGTTGTGTC TATATTTAAT 3720

ATCTTGATTT TAATGTAAAA AATGTAAAAA AAGAAGATTT GTATTCTCAA CTAAGTCAAC 3780

CTTATTGATA ATGGTATGAG AATATTTGTT CGAGATGGAT GAAGGTAATG AGTGAGAAAC 3840

TGGATTTTTA AAGTATGAGA CAATATTTTA AAAAGTTCAA TTATTAACTT ATAAGCAAAT 3900

AATTGCTATA AAAAAGTTTG GACGTGTACA ATTGCAATAT GAAGATTTTA AATTAATTGT 3960

AAAGTATCGA GGAGTGGGTA ACGTGTCAGA ACATGTATAT AATCTTGTGA AAAAGCATCA 4020

TTCTGTTAGA AAATTTAAGA ATAAACCTTT AAGTGAAGAC GTTGTTAAGA AATTGGTAGA 4080

AGCTCGACAA AGCGCTTCGA CGTCAAGTTT CCTGCAAGCA TACTCAATTA TTGGTATCGA 4140

CGATGAGAAG ATTAAAGAAA ATTTACGAGA AGTTTCTGGA CAACCTTATG TTGTAGAAAA 4200

TGGCTATTTA TTCGTCTTTG TTATTGATTA TTATCGTCAT CATTTAGTTG ATCAACATGC 4260

TGAAACTGAT ATGGAAAATG CATATGGTTC AACGGAAGGT TTGCTAGTAG GTGCAATCGA 4320

TGCAGCATTA GTTGCCGAAA ATATTGCGGT AACTGCTGAA GATATGGGGT ATGGCATTGT 4380

CTTTTTAGGA TCATTAAGAA ATGATGTTGA ACGCGTTCGA GAAATTTTAG ACTTACCTGA 4440

CTATGTCTTC CCGGTATTTG GTATGGCAGT AGGGGAACCc GCAGATGACG AAAATGGTGC 4500

AGCCAAGCCA CGCTTACCAT TTGACCATGT CTTCCATCAT AATAAGTATC ATGCTGATAA 4560

GGAAACACAG TATGCACAAA TGGCAGATTA CGACCAGACA ATCAGCGAGT ACTATGATCA 4620

CAAAGCAAGA TTAGATATGT TAGAACAATT GCAAAAATCA GGCTTAATAC AGCGATAgCA 4740

AGATACCAAA ATAACCCGCC CCCCTCTAGC TTAAAATGAT AAGTATAGCT AGAGGGGGCG 4800

GGTATTTCTT GCAATGAATT AGTGTGAAGT TAATGCAGCA TTATCATTTG AATCGAAAGT 4860

ATCTTTATCC CAATGTTTAG TTAACTTGGC GGTACCTGTA CCAGCTAGCA TTGAATCGTT 4920

CACGTTTAAT GCTGTTCTAC CCATGTCAAT CAATGGTTCA ACGGAGATGA GCACGCCGGc 4980

TAAAGCGACT GGCAAGTTTA ACGTTGACAA CACCAATATG GATGCAAATG TAGCCCCGCC 5040

ACCGACGCCA GCAACGCCGA ATGAACTAAT AATCACGACA GCGATTAACG TTACAATAAA 5100

TTGTAAATCA ATTTCTACAT TAGCGACGGG TGCGACCATA ATTGCAAGCA TGGCAGGGTA 5160

AATGCCTGCA CAACCATTTT GTCCAATCGA CAATCCAAAT GTCGCAGCGA AATTGGCAAT 5220

ACCTTCTGGC ACGCCTAGAC GTCTTGTTTG TGTTTGTACA TTCAATGGTA AGGCACCCGC 5280

GCTTGAGCGT GATGTGAATG CAAAGATTAA TACTTCCAAA GTCTTTTTAA CATAGCGAAT 5340

TGGGCTAATA CCTAACAGGC TTAAAATAAT TAAGTGAATG ATATACATCG TAATTAATGC 5400

AGCGTACGAT GCGATTAAGA ATTTTCCTAA AGTCCAAATG GCGCCAAAGT CACTTGTCGA 5460

TAATGTGTTG GCCATAATTG CTAATACACC GTATGGCGTT AAACGTAAGA CGAACGTCAC 5520

AATCGCCATT ACTAGTGAAT AGATAGCGTC AATCGCACGC TTAAGCAATT CACCATGATC 5580

AGGTTGTTTG CGTnTACGCG TAAATAAGCA AATCCTATAA ACGAAGCAAA TATCACGACA 5640

GCAATCGTGG aAGTTGCACG TTGTCCaGTG AAATCTAAGA ATGGATTTTT AGGCAATAAT 5700

TCCAAAATTT GTTGTGGTAA CGTATGTGCT GTTAAATCTT TCGCTTGTTT AGCAATTTCG 5760

CTTCCACGTG CTTGTTCAGC GTTACCAAGG TTAATTGTTG ATGCATCTAA ACCAAACACC 5820

AAGGCATACA CAACACCAAC AATCGCAGCA ATGGTGACAG TGCCAATTAA AAAGATAAAA 5880

ATGASACTAC CAATTTTAGC AAACTTTTCT CCGATTTGAA TTTTAGTGAA TGCAGCTACA 5940

ATAGAAATGA AAATTAAAGG CATAACAATC ATTTGCAACA ATGCAACGTA ACCTTGTCCG 6000

ACAATGTTGA ACCAGTCACT TGTTGATGTA ATAACATTCG AATGTGTGCC ATAAATAAGA 6060

TGCAATAACA CACCGAATAC TATACCAATC CCTAAAGCTG TAAACACACG TTTCGCAAAA 6120

GATATATGTT TGCGAGCCAT CATGTGCAAT ATTACGATGA AAATCACCAA TACAATAATA 6180

TTAATCAGTG TAAGAAAAGC ATTCATGAAC GTCACTCCTT AAATTTTTGA ATATAATTCC 6240

GACTAGTATG CT 6252

(2) INFORMATION FOR SEQ ID NO: 51: 
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(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

ATCAAATCnC AAAATATTTA TTAATnAnAA GGGGATTATC CaTGTgAGAA ACAAAGTAAT 60

GCTCTTTTTT TACCTCTTGT GGGTTGAAAA aTGGATCATC AGAGATAGAC TTCTTCTTTT 120

TCGAAGATGA CATTTGATAC TTTAATCTTC TAAAACCATA ACTTGTCGCA TCAAAAATGC 180

CTTCTTGTAC AAGTAAAATC AAAAATATGC TAATAAAAAT AATTAATGAA ACATAAAACA 240

ATATATTTAA ATATGTAATG ATAGTATGGC TATTAAAAAG CCATATAATA AACGTTAATA 300

TTGGCGTTAT TAGTGCCATT CCAAGCCATT TTTTCAACAT TTGATCACTC CCACTTATAG 360

AAAACTCTTA CGCATAGTTT ACATTAAAAT CAGACATTGA GGAATGATTT TTTAATTTCT 420

TCAGCTTTAT TGAAATTCTA AAATCAATCA TTCTTCATTA GTTTAAAGCA AAAAAATATT 480

GATATATAGT AAATATTGTA TATATAATAT TAGTTAAGAT TTCaGAAAAT TTTGAAGGGA 540

ATGGAAATTT AGAAATCGGA ATTTGTTAGA GGAGGGGATT AGATGGGGAA ATATATTTTC 600

AAACGATTTA TTTATATGCT TATTTCTTTA TTTATTATTA TTACAATTAC ATTTTTCTTA 660

ATGAAATTAA TGCCAGGTTC GCCATTTAAC GATGCTAAAT TAAATGCTGA ACAAAAAGAA 720

ATTTTAAATG AAAAATATGG ATTAAATGAT CCTGtAGCTA CGCAgTATTT ACATTATTTA 780

AAAAATGTTG TTACAGGCGA TTTTGGTAAT TCATTCCAGT ATCATAATCA ACCTGTGTGG 840

GATTTGATTA AACCGAGACT ACTACCTTCT TTTGAAATGG GTCTTACAGC AATGTTCaTC 900

GGTGTGATAC TGGGACTTAT TTTAGGTGTT GCAGCAGCTA CTAAACAAAA TTCTTGGGTT 960

GACTATACAA CTACAGTTAT TTCAGTTATT GCAGTATCTG TACCATCTTT TGTACTTGCT 1020

GTACfTTTAC AATATGTATT TGCAGTTAAA TTAAGATGGT TCCCAGTAGC TGGATGGGAA 1080

GGTTTTTCGA CCGCGGTATT ACCGTCACTT GCATTATCTG CAGCTGTTTT AGCAACTGTC 1140

GCCAGATACA TAAGAGCAGA GATGATAGAG GTATTAAGTT CAGACTATAT TTTATTAGCG 1200

AGAGCTAAAG GTAATTCGAC AATGCGTGTA CTTTTTGGAC ATGCACTTAG AAATGCTTTA 1260

ATTCCAATTA TTACAATTAT CGTTCCCATG TTAGCAAGTA TTTTAACAGG CACTTTAACA 1320

ATTGAAAATA TTTTTGGAGT TCCTGGATTA GGGGATCAAT TCGTACGTTC AATTACAACA 1380

AATGATTTCT CAGTAATCAT GGCAATCACA CTATTATTTA GCACACTGTT TATCGTTTCT 1440

ATTTTTATTG TAGATATTTT GTACGGTGTG ATAGATCCAC GAATTCGTGT TCcAAGgAGG 1500

TAAAAAATAA TGGCTGAAAA TAAAAACAAT TTGTCGATTA ACGACGATCA TTCTAATGCA 1560

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TGAATCAGGA ACCTGAAATG CAACGAGAAA GCAAAAACTT TTGGCAAGAT GCTTGGGCTC 1S80 

AGTTAAAACG AAATAAGTTA GCTGTTGTCG GTATGATAGG TTTAATTATC ATTGTAATAT 1740

TTGCTTTTAT CGGTCCAGTT ATAAATAAAC ATGATTATGC TGAACAAAAT GTAGAACATA 1800

GAAATCTTCC GGCAAAAATA CCTGTATTAG ACAAAGTTCC ATTTTTACCT TTTGATGGTA 1860

AAGATGCAGA TGGCAAGGAT GCTTATAAAG CAGCAAATGC TAAAGAAAAT TATTGGTTTG 1920

GTACTGATCA GTTGGGTCGA GATTTATGGA CAAGAACATG GAAAGGTGCT CAAATTTCAT 1980

TGTTTATCGG TGTTGTTGCA GCGATGTTAG ATATTTTTAT TGGTGTTGTA TATGGTGCGA 2040

TTTCTGGATT CTTCGGTGGA CGTGTCGATA CGATTATGCA ACGTATACTT GAAGTCATAG 2100

CATCTATTCC GAATTTAATT GTCGTAATTT TATTTGTATT AATTTTTGAA CCATCCATTT 2160

GGACAATTAT ATTGGCTATG TCTATCACAG GCTGGTTAGG CATGAGCAGA GTTGTACGTG 2220

gagaattttt ajlaattaaaa aatcaagagt ttotoatggc ttcgaaaaca ttgggggctt 2280

caaaattcaa attgatattt aagcatattt tacctaatac attaggtgct atcgtggtta 2340

catcaatgtt tacagtacct agtgctattt tcttcgaagc atttttaagt ttcattggta 2400

TAGGTGTACC CGCACCTCAA ACATCGTTAG GGTCATTAGT AAATGATGGG CGCGCAATGT 2460

TATTAATTTA TCCACATGAA TTATTTATAC CAGCAATGAT TTTAAGTTTA TTAATTCTAT 2520

TCTTTTACTT ATTTAGTGAT GGATTACGTG ATGCATTTGA TCCGAAAATG CGTAAATAAA 2580

AAGGGGGCAT AGCATATGAC TGAAAGAATA 77AGAAGTAA ATGATTTGCA TG7TTCCTTT 2640

GATATTACAG CAGGGGAAGT GCAGGCAGTG AGAGGCGTAG ATTTTTATTT GAACAAAGGG 2700

GAAACATTGG CAATTGTTGG TGAATCAGGT TCAGGTAAAT CTGTAACAAC AAAAGCAATT 2760

ACAAAATTAT TCCAAGGGGA CACAGGAAGA ATTAAAAAGG GAGAAATTTT ATTTTTAGGG 2820

GAAGATTTAG CAAAAAAACC TGAAAATGAG TTGATTAAAT TACGTGGCAA AGATATTTCA 2880

ATGATCTTTC AAGATCCAAT GACATCTTTA AACCCAACGA TGCAAATTGG TAAACAAGTC 2940

ATGGAACCAT TAATTAAGCA CAAAAATTAT AGTAAAGCAC AAGCTAAAAA GCGCGCATTG 3000

GAAATACTAA ATCTTGTAGG TTTACCAAAT GCAGAAAAAA GATTTAAAGC ATATCCTCAT 3060

CAATTTTCAG GTGGACAAAG GCAAAGAATT GTTATTGCAA CCGCATTAGC TTGTGAACCT 3120

AAAGTGCTCA TTGCTGATGA ACCAACGACT GCATTAGACG TAACGATGCA GGCACAAATT 3180

TTAGATTTAA TGAAAGAACT ACAACAAAAA ATCGATACAG CAATTATTTT TATAACGCAT 3240

GA7TTAGGGG TTGTTGCGAA TATTGCTGAT AGAGTGGCAG TTATGTATGG TGGTCAAATG 3300
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GGAGCGCCAC CTGATTTATT ACACCCACCT AAAGGTGATG CATTTGCGAG ACGTAGcAAT 34 BO 

ATGCATTAGA TATTGATTTT AAAGTAGAAC CACCGTGGTT TAAAGTTTCA CCGACACATT 3540

TTGTGAAATC TTGGTTATTA GACGCACGTG CACCAAAAGT TGAACTACCC GAGCTGGTAA 3600

AACAACGTAT GAAACCGATG CCTAATAATT ATGAAAAACC ACTCAAGGTA GAAAGGGTGT 3660

CGTTCAATGA AAAATGATGA AGTGCTATTA TCTATTAAAA ATTTAAAGCA ATATTTTAAC 3720

GCAGGAAAGA AAAACGAAGT GgaGCGATTG AAAATATTTC GTTTGATATA TACAAAGGGG 3780

AAACATTAGG TTTAGTAGGA GAATCGGGGT GTGGTAAATC TACAACTGGT AAATCAATTA 3840

TTAAACTTAA TGATATTACA AGTGGAGAAA TTTTGTATGA GGGTATTGAT ATACAAAAGA 3900

TTCGTAAACG TAAAGATTTG CTTAAATTTA ATAAAAAGAT ACAGATGATT TTTCAAGACC 3960

CATATGCGTC TTTAAATCCT AGGTTAAAAG TAATGGATAT AGTAGCTGAA GGTATTGATA 4020

TCCATCATTT AGCAACTGaT AAGCGTGACC GAAAAAAACG TGTCTATGaT TTACTTGaAA 4080

CTGTTGGATT AAGTAAAGAA CATGCCAATC GCTATCCTCA TGAATTTTCA GGTGGaCAAC 4140

GCCAACGTAT TGGaATTGCC CGTGcATTAG CCGTTGaACC AGAATTCATT ATCGCGGACG 4200

AACCAATATC GGCATTGGAT GTTTCAATCC AAGCTCAAGT AGTTAATTTA TTATTAAAAT 4260

TACAACGTGA AAGAGGCATT ACGTTCCTAT TTATAGCTCA TGATCTATCA ATGGTGAAGT 4320

ATATTTCAGA TCGTATTGCA GTCATGCATT TTGGGAAAAT AGTTGAAATT GGACCGGCAG 4380

AAGAAATTTA TCAAAATCCA TTACACGATT ATACTAAGTC TTTATTATCA GCCATTCCAC 4440

AACCTGATCC TGAATCAGAA CGCAGTCGCA AACGATTTAG TTATATTGAT GATGAAGCAA 4500

ATAATCATTT AAGACAATTA CATGAAATTA GACCGAATCA CTTTGTCTTT AGTACTGAAG 4560

AAGAAGCGGC ACAACTACGA GAAAATAAAT TGGTGACACA AAATTAAGGG GAAGGGGGAA 4620

ATGCAATGAC GAGAAAATTT AGAACACTTA TTTTAATTTT GATTGCTACA ATTGCATTAA 4680

GTGGTTGTGC TAATGACGAT GGTATTTATT CAGATAAAGG TCAAGTATTC AGAAAAATTT 4740

TGTCATCAGA CTTAACATCC CTTGATACAT CATTAATAAC GGATGAAATA TCTTCTGAAG 4800

TGAcTGCGCA AACATTCGAA GGTTTATACA CATTAGGAAA AGGTGACAAA CCGGTGTTAG 4860

GTGTTGCGAA AGCTTTTCCT GAAAAGAGTA AAGATGGTAA AACTTTAAAG GTTAAATTAA 4920

GAAGCGATGC TAAATGGAGC AATGGTGACA AAGTGACTGC ACAAGACTTT GTTTATGCTT 4980

GGAGAAAAAC AGTTGACCCT AAAACAGGTT CTGAATTTGC ATACATTATG GGGGACATTA 5040

AAAATGCGAG TGATATTAGT ACTGGTAAGA AACCTGTAGA GCAATTAGGT ATCAAAGCAT 5100

TAAATGATGA AACATTACAA ATTGAATTAG AAAAGCCGGT TCCATATATT AATCAATTAT 5160
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w 



ACGGTACGGC AGCTGATAGA GCGGTATACA ATGGTCCaTT TAAAGTTGAT GATTGGAAAC 5280

AAGAAGATAA AACCTTACTA TCTAAAAATC AGTATTATTG GGATAAAAAG AATGTAAAAT 5340

TAGATAAAGT GAATTATAAA GTTATTAAAG ACTTACAAGC CGGTGCATCA TTGTATGATA 5400

CTGAATCAGT AGATGACGCA TTTATTACTG CAGATCAAGT AAATAAATAT AAAGACAACA 5460

AAGGATTAAA CTTTGTGTTA ACGACTGGGA CATTTTTTGT AAAAATGAAT GAAAAACAAT 5520

ATCCTGATTT TAAAAACAAA AATTTAAGAT TGsTATCGCA CAAGCAATAG ATAAAAAAGG 5580

ATACGTTGAT TCAGTGAAAA ACAATGGCTC AATTCCTTCC GATACACTAA CAGCCAAAGG 5640

AATTGCGAAA GCGCCTAATG GCAAAGATTA TGCGAGTACC ATGAATTCGC CTTTAAAATA 5700

TAATCCTAAA GAAGCAAGAG CACACTGGGA CAAAGCTAAA AAAGAGTTAG GTAAAAATGA 5760

AGTGACATTT TCAATGAACA CAGAAGATAC ACCAGATGCA AAAATATCTG CTGAATATAT 5820

GAAATCGCAA GTTGAGAAAA ATTTACCAGG AGTTACTTTG AAAATTAAGC AATTACCGTT 5880

TAAACAAAGA GTATCACTAG AACTGAGTAA CAATTTTGAA GCATCACTTA GTGGTTGGTC 5940

TGCAGATTAC CCTGATCCTA TGGCTTATTT AGAAACAATG ACCACAGGTA GCGCACAAAA 6000

TAATACAGAC TGGGGTAATA AAGAATATGA TCAATTACTT AAAGTAGCAA GAACCAAATT 6060

GGCACTTCAA CCGAACGAAC GATATGAAAA CTTGAAAAAA GCAGAAGAAA TGTTCCTAGG 6120

AGATGCACCG GTAGCACCAA TTTATCAAAA AGGTGTtGCA CATTTaACAA aTCCTCAAGT 6180

AAAAGGATTA ATTtACCATA AATTTGGTCC AAATAACTCA CTTAAACATG TATATATTGA 6240

TAAATCGATA GATAAAGAAA CAGGTAAGAA GAAAAAATAA TATGCTTTGT AAATTAGGCT 6300

GGAGACATAT CTCCAGTCTT TTTGTGTTGG ATAAAAaCTT TGGGAATAAA AATTTAAAAT 6360

AAGTCGTTTT TTAAATTACT GAAATTGATT AAATGCATAA ATAACTGAAT ATTCTAAAAA 6420

TAAACTTGTA ATAATTTTTT CTATGAGTAA ACTAAAAAGA AAAAATTAGA TTGAAAGTAG 6480

GAGGCATATG TATGGGGAAG CTAATTAAAT ATATTTCAAT ACTTCTTATT GTCGTTTTAG 6540

TGTTGAGTGC TTGCGGAAAA AGCAGTAATA AAGATGAAGG AGTAAAAGAT GCTACTAAAA 6600

CGGAAACCTC AAAACATAAA GGTGGTACCT TAAATGTAGC ATTAACAGCA CCGCCAAGTG 6660

GTGTTTATTC TTCGTTATTA AATAGTACAC ATGCAGATTC TGTAGTTGAG GGATATTTTA 6720

ACGAAAGCTT 6730
(2) INFORMATION FOR SEQ ID NO: 52: 
(i) SEQUENCE CHARACTERISTICS:
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25 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

AATTTTTGTC ATTATTAAAA ACCTCGCTTT TAAAAGATTG AAAAGTAAAT GAGTGAAATT 60

AAAGATTATG CACATTAAAA TCACGCCACA ATTTAATTGT GAAAAATATC ACAAATATAT 120

TATAACACTA AATTTCCCAA AATTCAAAAG TGTGTTTTAT TGCAGAAAAC TTATAACAyG 180

TGCACAAGTT ATAGTGAATT GCAAACGGAT TACTTTAGTC TTTTTAAAAC ATGAAGTATA 240

ATTTGTATAG CAATAAATAT AAAAATGGGA GGCTATGTTC AATGAGCAAT ATGAATCAAA 300

CAATTATGGA TGCATTTCAT TTCAGACATG CGACTAAGCA ATTCGATCCA CAAAAGAAAG 360

TTTCGAAAGA AGATTTTGAA ACAATATTAG AGTCAGGTAG ATTGTCTCCA AGTTCTCTTG 420

GGTTAGAACC TTGGAAGTTT GTCGTGATTC AAGATCAAGC GTTACGTGAT GAATTAAAAG 480

CGCACAGTTG GGGCGCAGCA AAACAATTAG ATACAGCGAG CCATTTTGTG CTAATTTTTG 540

CGCGTAAAAA TGTAACGTCA AGATCACCGT ATGTACAACA TATGTTAAGA GATATTAAAA 600

AATATGAGGC ACAAACGATT CCAGCTGTTG AACAAAAATT CGATGCATTC CAAGCAGATT 660

TCCATATTTC TGATAATGAT CAAGCCTTGT ATGACTGGTC AAGTAAACAA ACGTATATTG 720

CATTAGGCAA TATGATGACG ACAGCCGCAT TGTTAGGTAT TGATTCATGT CCGATGGAAG 780

GTTTTAGTCT GGATACAGTG ACAGACATTT TAGCAAATAA AGGGATCTTA GATACTGAGC 840

AATTTGGTTT ATCAGTGATG GTCGCATTTG GCTACAGACA ACAAGAGCCA CCGAAAAATA 900

AAACACGCCA AGCTTATGAA GATGTTATTG AATGGGTTGG ACCAAAAGAA TAAATAGAAT 960

ACCGTATGTC TAAATATATA AAATTAAAAA GTTAGCAATA AAAAAGCCTG CGATTACATA 1020

AATGAATCGC AGGcTTTTGC GTGAAAAAAT TGTATTAATA AAGTATGGAT GATTATTTTT 1080

CTGG^ACAAG GTCAGTATTT GAATGAACTG TGATGTCAAA CCCTTCTGGT GCCGTAAATG 1140

TATGTGTTGA GGCGTCGGGT TGATAAATAT CAACATGTGT TAATCCATAA CTTTGTGAAT 1200

TGTTTTGTCT TGCTTGATTG GATTGCCAAG TATTAGCAGC AATATGATGG TGATAATGAT 1260

TCGTTGACAT AAATAGCGCA CGTGGAAAAT CAGACACATG TTGGAATCCT AATTGTTCAA 1320

TGTAACATTG ATATGCTGCG TCTAAATCAT GTGTTTTTAA ATGTAAGTGT CCAATCATGC 1380

CTTTTGCTGG CATTCCTTGC CAACCTTCAT CAGTACGATG TGTTAATAAG GTTTGGCTAT 1440

CAACTTCTAA AGTATCCATT TTAACTTTGC CATTTTGCCA TTCCCATGAA GATGAAGGTC 1500

TATCGCGATA GACTTCAATA CCATTACCTT CGGGGTCGTT GAAATATAAA GCTTCACTTA 1560

CTAAATGATC ACCAGCGCCG ATGCCCATAT TTTTTTGTGC CACGAAATAT AAGAAGTTAG 1620
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aAGTCTGACG GcCGTCTTCT AATAAATGTA ACGTTAGAGT ATGGcCACCA GTCCCAACAG 174 0 

ATAATACGGT TGTATTATCG TCAGAACTTT TAACGGATAG TCCTAAAATG TTTTTGTAAA 1800

ATGTTGTCAT TAAGTCTAAG TCTCTTACGT TCAGTACAAT GTTTGTCACT TGTGTTGCTG 1860

TTTTATCGTG AAATGCCATT ATGCATCGCC TCTTTTTCTA TTTTTCTATA AGTTAGTATA 1920

AAAAGTATAC CAGAAAAGAA AATGAATTGA TAGCATAAAG TTTGAAATGC AAAATAACTA 1980

GTCGTTTTGC AATTTTAtAT TGATGCGAAC AAAAAAGCGA TGGTACAGTT GCACCATCGC 2040

AAAATTTATT TAACCAAGAT ATACATCTTG ATATGAATCT TCTTTTTCTA ACATATGTTT 2100

GGCAAATGAA CATGAGGCAA TAATTTTCAA ATTATTTTCT CGAGCGTGTT CAACAACTGc 2160

TTTAAGTAGT TTTTTGCCAA CACCTTGACC ACCAAGTTCA TCAGATACGC CTGTATGATC 2220

AATGTTAATT TCATTATTAT CCACAAAACG GTATGTGATT TCAGCTAAAG CATTATTTTC 2280

ATCATCACCA ATATAGAATT TGTTCTCGCC TTGTTTGATT TCAAGGTTAC TCATACATAT 2340

CAACTCCTAT CATGATTGAT TATAGTATTT CCCTATTCTA TTTTAACTTA AACGAAGTCA 2400

AAGGTGCATG ACAGTCATGT GACGACATTG CCACATCTAT GTAGTCGTTT TTATTAAGCA 2460

CAGTTTGAAA TGAAGATGAA AACACGTATC TTGACATTAA ATCTATTCAG CTATATAATT 2520

TATCTCGAAA TCGAAATAAA ATAAAAAAGT TGGTGATCAT ATGGATCGAA CGAAACAATC 2580

TCTCAATGTT TTTGTCGGAA TGAATAGGGC GTTAGACACA TTAGAGCAAA TTACAAAAGA 2640

AGACGTAAAG CGATATGGCT TAAATATTAC TGAATTTGCA GTGCTCGAGT TGCTTTATAA 2700

TAAAGGTCCG CAACCAATTC AACGTATTAG AGACCGCGTA TTAATTGCAA GTAGCAGCAT 2760

TTCATATGTT GTAAGTCAAT TAGAGGACAA AGGTTGGATT ACACGTGAAA AGGATAAAGA 2820

TGATAAACGT GTATATATGG CTTGTTTAAC TGAAAAAGGT CAAAGTCAAA TGGCAGATAT 2880

TTTCCCTAAG CATGCTGAGA CATTAACAAA AGCGTTTGAT GTGTTAACAA AGGATGAATT 2940

AACAATCTTA CAACAAGCGT TTAAGAAACT AAGTGCACAA TCTACAGAAG TGTAAGGCGT 3000

GCACTAAAAA TTTACATTAA AGTATCTCGA TTTCGAGATA AATGCACTAA AAATATAAAG 3060

AGGGTATATA AAATGATAAA TAATCATGAA TTACTAGGTA TTCACCATGT TACTGCAATG 3120

ACAGATGATG CAGAACGTAA TTATAAATTT TTTACAGAAG TACTAGGCAT GCGTTTAGTT 3180

AAAAAGACAG TCAATCAAGA TGATATTTAT ACGTATCATA CTTTTTTTGC AGATGATGTA 3240

GGTTCGGCAG GTACAGACAT GACGTTCTTT GATTTTCCAA ATATTACAAA AGGGCAGGCA 3300

GGAACAAATT CCATTACAAG ACCGTCTTTT AGAGTGCCTA ACGATGACGC ATTAACATAT 3360
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TTAAATGAAG GGGTAGCACC TGGTGTACCT TGGAAGAATG GACCGGTTCC AGTAGATAAA 3 54 0 

GCGATTTATG GATTAGGCCC CATTGAAATT AAAGTAAGTT ATTTTGACGA CTTTAAAAAT 3600

ATTTTAGAGA CTGTTTACGG TATGACAACT ATTGCGCATG AAGATAATGT CGCATTACTT 3660

GAAGTTGGCG AAGGAGGCAA TGGTGGCCAG GTAATCTTAA TAAAAGATGA TAAAGGGCCa 3720

GCaGCACGTC AAGGTTATGG tGAGGTACAT CATGTGTCAT TTCGTGTGAA AGATCATGAT 3780

GCAATAGAAG CGTGGGCAAC GAAATATAAA GAGGTAGGTA TTAATAACTC AGGCATCGTT 3840

AATCGTTTCT ATTTTGAAGC ATTATATGCA CGTGTGGGGC ATATTTTAAT AGAAATTTCA 3900

ACAGATGGAC CAGGATTTAT GGAAGATGAA CCTTATGAAA CATTAGGCGA AGGGTTATCC 3960

TTACCACCAT TTTTAGAAAA TAAAAGAGAA TATATTGAAT CGGAAGTTAG ACCTTTTAAT 4020

ACGAAGCGTC AACATGGTTA ATTGGAATGA GGAGGATTTG TGATGGAACA TATTTTTAGA 4080

GAAGGACAAA ATGGTGCGCC AACACTAATA TTATTGCATG GTACAGGTGG TGATGAGTTC 4140

GATTTATTAC CGTTAGGCGA AgcATTGAAT GAAAATTATC ACTTGTTAAG TATTAGAGGA 4200

CAAGTTTCAG AAAATGGGAT GAACCGTTAT TTCAAACGTC TTGGTGAAGG TGTTTATGAT 4260

GAAGAAGATT TGGCATTTCG TGGACAAGAA TTGTTGACGT TCATTAAAGA AGCTGCTGaA 4320

CGTTATGATT TTGaTATTGA AAAAGCAGTA CTTGTTGGAT TTTCAAATGG ATCAAATATA 4380

GCGATTAACT TAATGTTGCG TTCAGAAGCA CCATTTAAAA AAGCATTGTT ATATGCACCG 4440

TTATACCCAG TTGAAGTAAC GTCAACAAAG GATTTATCAG ATGTCAGTGT GTTGCTTTCT 4500

ATGGGGAAAC ATGATCCAAT TGTGCCATTA GCTGCAAGTG AACAAGTCAT TAACTTGTTT 4560

AATACACGTG GGGCACAAGT CGAAGAAGTT TGGGTGAAGG GCCATGAAAT TACAGAAACT 4620

GGATTAACGG CTGGTCAACA AATACTTGGG AAATAACAGT TCTATTAAGA AGCGGACAGA 4680

TGGAAAAGAT TTTTACTTTT CATCTGCCCG CTTTTTTGAT TTTGAAGTGC TGTACTAAAT 4740

TTTACAATAG TATAGATATT TTAATCGATA TGAGATTTGC CGGTAATACG CTTAATTAAA 4800

CCTTTATAGA GTACAGGTAT GAGTAAGATG AAACCGAACA ATCCCATAAT AGGGAATACT 4860

TTTCCAATTA ATGAAATGAa ACCGATAAAT GTACTAATAT AAGTGATGAC AGCCATTGTA 4920

ATAATAATGA TGAAGTAACG TCTGCTGAAT GGAACGCTGA AACGTGACGC AAATGCATAC 4980

ATTAATCCAA CAACAGTATT GTAGATGACA AGTATCATAA TGACAGACAT AATAATACCA 5040

ATTGACGGAG ACATTTGTGT CGCTAATTTT AATGTAGGTA GATCTACGTG TTTAATTTTA 5100

TCGAATTGAG AAATTAAACC TAGATTAATC ATCATGAGTA AAAATGTAAT GATTAAACCG 5160

CCAATCAAGC CCCCGTATAA CGTTGAGTCA CGATATTTAA CTTTACTACC CATCACTGAT 5220

CCAGGTGATA ATGATTTCTG CTTATGAATC TGAGCATCAT TATTAGCGGC AGTAAAATCA 5340

AGATGACTTG TTGTGAAATA GTAGACCGCA ATCATAATGA CAATCGCAAT TAAAAATGGG 5400

GTAACACCGC CAAGCACAGC AATTAAACGA TCGAATTTTA GAAACAGTGT TGCTAAAATA 5460

AAGGCGACTA ATATGAGTGC GCTCAGCCAA TACGGTAAGT TGAAACTTTG ATGAATGGTT 5520

GACGCACCAC CTGCAGTCAT AATAATAGCT AAAGACAACA TAAACATTGT TAAAATAATA 5580

TCAAAACCTC TTGCAATAGA GGGGTATAAG AAATAGTTAA TTGAATCAGA ATGATTTCTG 5640

GACTTTAGAT GATGACCTGT ATGCATGACA ACCATTCCAC CTAAAGTAAT CAATAGTCCT 5700

GTTACAATAA TGCCTGAAAT GCTATATGCG CCATGACTTG TGAAAAACTG GAAAATTTCT 5760

TGACCAGTAG CAAAGCCGGC ACCAACGACA ACACCAACAA AGGCAAATGC CACAATAATG 5820

GACTCTTTTA AGATACGCAT GATTTAAAAA TGTCCCTTCG TAATTTTAAG TAATATAGAA 5880

AATGTAACAT ACATGTTAAT GAAAAATATA GTACTAATAT AGTATTTTGT TAAATTGCAG 5940

TAGAAGCGAG GGTGTCGGTC ATTTCATTAA TTTATTAGTT GATTTTGCAT TTTTTTGCTG 6000

TAAAGTTGTT ATAATACAGT TAACAGGAAT TAGCATAGAT ACACCAATCC CCTCACTACT 6060

CGCAATAGTG AGGGGATTTT TTTCGGTGTA GCTAGGTCGC CTATTTATCA TCGTGTTTGC 6120

GTAgCaATGC GTAAACACAG TACCACTAAA TAAGTGCACG ATACATGCAT CAAATGTCGT 6180

CTTTAGTcTA AGTAACGATC ATGCATTAAC ATTTTCAAAA TATCTATTTG AGCTTGAAGA 6240

TCTTTACCAA TATTGGTATC ACGAATCTTC TTACGTTGTA ATTCTTTATC TACGACGCGC 6300

TTTATAGAAA GTTCATCGAT ACCTTCGGAA AGTATTTTTn CTTTAGCGTT AAATTGTTGG 6360

TGTGCAACGA GTTGCATACC GAATGAATTA TACAATAGTG TATAGCCTGC AATGCCAGTn 6420

GTTGACTGAT AAGCTTTTGA AAAGCCACCA TCAATGACAA GCATCTTTCC ATCAGCCTTG 6480

AT 6482



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16552 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

ATTTAAGGCG ATTGCTTGTG TATTTCTCTC TTTTGTAGGC AAACCTGCAC TCGTTCCAAA 60
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w 



AATTTTTCTA ACTTTAACGT AGACATAACT ATATAAATTT TGATAATTAC GTTATACTTA 240

TCATTAATAA GTATCACATT AAACATGATA CATGAATCGA TATTTCATTT AAGACACTGC 300

ATACAGTCGA GCATATTGTA TGACCTACTG AATGGATTAT CTTATAATAA TAAATCATAT 360

ATCTAATTAA GAATTGAGGT TTTAATCTTG AGTACTAAAA ACAAACACAT CCCATGTTTA 420

ATCACAATCT TTGGTGCACT GCGTGACTTA AGCCATCGTA AGTnGTTTCC ATCAATATTC 480

CATCTCTACC AACAAGACAA TTTAGATGAA CATATTGCCA TcATCgGTAT TGGACGTCGT 540

GACATkwnTA ATGATGATTT CCGTAATCAA GTAAAATCAT CAATTCAAAA GCACGTAAAA 600

GATACAAACA AAATTGACGC GTTTATGGAA CATGTCTTCT ATCATAGACA TGATGTTAGT 660

AATGAAGAAA GCTATCAAGA ATTACTAGAT TTTAGTAATG AATTAGATAG CCAATTTGAA 720

TTAAAAGGTA ATCGACTATT CTATTTAGCA ATGGCACCAC AATTCTTTGG CGTTATTTCT 780

GATTATCTAA AATCTTCTGG TCTTACTGAT ACAAAAGGAT TTAAACGCCT TGTTATCGAA 840

AAACCATTCG GTAGTGATTT AAAATCAGCC GAAGCATTAA ACAATCAAAT TCGTAAATCA 900

TTTAAAGAAG AAGAAATTTA TCGTATTGAC CACTATTTAG GAAAAGACAT GGTTCAAAAT 960

ATCGAGGTAT TACGTTTTGC GAATGCGATG TTTGAACCAT TATGGAATAA CAAATATATT 1020

TCAAACATCC AAGTTACATC TTCTGAAATA CTAGGTGTTG AAGATCGTGG TGGTTATTAT 1080

GAATCAAGTG GCGCGCTAAA AGATATGGTG CAAAACCACA TGTTACAAAT GGTTGcATTA 1140

TTAGCTATGG AAGCACCTAT TAGTTTAAAT AGTGAAGATA TCCGTGCTGA GAAAGTAAAA 1200

GTACTTAAAT CACTGCGTCA TTTCCAATCT GAAGATGTTA AAAAGAACTT TGTTCGTGGT 1260

CAATATGGCG AAGGCTATAT CGATGGTAAA CAAGTTAAAG CATACCGTGA TGAAGATCGC 1320

GTTGCAGATG ACTCTAACAC ACCTACCTTT GTTTCAGGTA AATTAACAAT TGATAACTTT 1380

AGATGGGCTG GTGTACCATT CTATATTCGT ACTGGTAAAC GTATGAAATC TAAAACAATT 1440

CAAGTTGTCG TTGAATTTAA AGAAGTACCA ATGAACTTAT ACTATGgAAA CTGaTAAACT 1500

GTTAGATTCA AACCTATTAG TAATCAATAT CCAACCTAAT GAAGGTGgTA TCTTTtACAT 1560

CtAAATGcTA AGaAAAATAC ACAAGGTATC gAAACAGrAC CTGtCCmATT GtCTTACTCm 1620

ATGaGCGcTC aAGaTAAAAT GaATACTGTA GATGCATATG AAAATCTATT ATTTGATTGT 1680

CTTAAAGGTG ATGCCACTAA CTTCACGCAC TGGGAAGAAT TAAaATCAAC ATGGAAATTT 1740

GTTGATGCAA TTCAAGATGA ATGGAATATG GTTGaTCCAG AATTCCCTAA CTATGAATCA 1800

GGTACTAATG GTCCATTAGA AAGTGATTTA CTACTTGCTC GTGATGGTAA CCATTGGTGG 1860

GGACGATATT CAATAATTGA ATTAAAACGC ACATGTTAAA CAAAAATAAA TGAGCGAATG 1920
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TATATTATGA AATTATATTT TACAATGCCC AAAACTATTT TAATAATCAT TGAACAAATG 2 04 0 

GGTGTATAAT TTATAGAAAT AATGTAGAAT AAAAATAAAT GATTGAATTA ATTGGAGTGA 2100

AAGTTTTGGA CGTTATCAAG CAAATACAAC AGGCAATTGT TTATATTGAA GATCGTTTAT 2160

TAGAGCCTTT CAATTTGCAA GAATTAAGTG ATTACGTTGG TCTTTCGCCA TACCATCTTG 2220

ATCAATCATT TAAAATGATT GTCGGCTTAT CTCCAGAAGC TTATGCACGC GCGCGTAAAA 2280

TGACACTCGC TGCAAATGAT GTGATTAATG GTGCTACACG ACTTGTAGAT ATCGCTAAAA 2340

AATATCACTA TGCAAATTCA AATGATTTTG CAAATGATTT TAGTGATTTT CACGGCGTAT 2400

CACCTATTCA AGCCTCTACT AAAAAAGATG AATTACAAAT TCAAGAGCGA TTATATATCA 2460

AATTATCAAC TACTGAGAGA GCACCTTATC CATACAGATT AGAAGAGACA GATGATATTT 2520

CATTGGTTGG ATATGCACGA TTTATAGACA CTAAGTATTT GTCACATCCT TTTAATGTTC 2580

CGGATTTTTT AGAAGACTTG CTCATTGATG GTAAAATTAA AGAGTTACGA CGATATAATG 2640

ACGTTAGTCC ATTTGAACTA TTTGTTATTA GTTGTCCTCT TGAAAATGGT TTAGAAATAT 2700

TTGTAGGTGT ACCAAGTGAA CGTTATCCTG CACACTTAGA AAGTCGATTT TTACCTGGCA 2760

AACATTGTGC GAAATTCAAT TTACAAGGTG AAATTGATTA TGCAACTAAT GAAGCTTGGT 2820

ACTATATTGA ATCAAGTTTG CAGTTAACAT TGCCATATGA ACGAAATGAT TTATATGTTG 2880

AAGTGTACCC TCTCGATATT TCATTTAATG ACCCATTCAC TAAAATTCAG CTTTGGATTC 2940

CTGTTAAACA GAGTCCTTAT GACGAAGATT AAATAATAAA AAACAAAGAA GCCCCCTAAT 3000

ATATCTATAG GTCTACAAAT GGCCTTAGAT TCTATTAGGG GGCATATTAA TATGTTAATT 3060

TAGTTCGATA ACACATGCTT CATATGGACG TAACTGTTTT AAATTAACTT TGGCATCATA 3120

ATTAAATAGC TTTACTTCTC CATGGCTTAA ATCAAATGGT ACAGTTAATT CTGCTTCGTG 3180

GTTAGTAAGA TTACCTACAA TAAGAACTTG CTTTTCATTT AATGTTCTCG TGTACGCAAA 3240

AACTTGTGAA TTTTCAGCAT CTACTAAATC AAATTGACCA TATACGTATA CATCATTAGA 3300

CTTTCTTAAT TGAATTAAAT CTTTATAAAA TTGTAATACT GAATGCTCAT CTTCTAATTG 3360

TTGTGCAACA TTGATAGTTT TATAATTCGG ATTCACTGGG AACCACGGTT CACCATTTGT 3420

AAATCCTCCA TTTAACGTAT CATCCCATTG CATTGGTGTG CGAGAATTAT CTCGGTTCTC 3480

ATCTTTATAT TTCGCAAGTA AAGCGTCTAC ATCTCCACCT TGAGCTTTCA CTATTTGATA 3540

GTCATTTTTA ACAGCAACAT CGTTAAACGT TTCAATACTT TCAAATGGAT AATTCGTCAT 3600

ACCAATTTCT TGACCTTGAT AAATGAATGG CGTACCTTGT TGCAAGAAAT AAACAGCTGC 3660
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CCATCTATTT AATACAGATT TATACGAATT TACATCAAAG TGAGAATCAC CACTATTCCA 3 84 0 

CAGTCCCAAA TGTTCAAATT GGAATATCAT ATTAAATTTA CCATTTTCTT CCCCGACCCA 3900

GTCATCAGCA TCATCAGGGC TTACACCATT CGCTTCACCA ACAGTCATAA TGTCATACTT 3960

ACTTAATGAG CGATCTTTCA TCTCTTGTAA CCAAGTTTGT ATACCTGGCT GATTCATATC 4020

TACATCAAAT GCTGGGGCAT ATGTTTTACC CTCAGGTACA GGTAAGTCAC CCGCTTCAAA 4080

CGTCTTCTTA ATATGCGTAA TTGCATCTAC TCTAAATCCA TCAATGCCTT TATCAAACCA 4140

CCAGTTCATC ATTTCAAATA CAGCATCTCT AACTTCCGGA TTACCCCAAT TCAAATCAGG 4200

TTGTTTTTTA CTGAATAAAT GGAAATAATA TTGCTCAGTA TTAGCATCAT ATTCCCATGT 4260

AGATCCATTA AATATACTTT CCCAGTTGTT AGGTTCAGAG CCATCTGGCT TTGGATCTTG 4320

CCAAATGTAC CAATCACGTT TGGGATTGTC TTTACTAGAT TTGGATTCTA TAAACCAAGG 4380

ATGTTCATCA GATGTATGAT TTACAACTAA ATCTAAAATA AGCTTCATGC CTCTATCATG 4440

AACACCTTTT AATAAACGAT CAAAGTCTTC CATCGTTCCA AATTCATCCA TAATCTCTTG 4500

GTAGTCACTA ATATCATAAC CATTGTCATC ATTAGGTGAT TTAAACATTG GACTGAGCCA 4560

AATGACATCG ATACCGAAAT CTTTTAAGTA GTCCAATTTA TCAATCATTC CAGGTAAATC 4620

CCCAATACCA TCGTGATTAC TATCATTAAA ACTTCTTGGA TATACTTGAT ATGCTACTGC 4680

TTCTTTCCAC CATTGCTTAT TCATTTTAAA ACTCCTTTGC TATCGCTGTG TTGATTTTCT 4740

TATTTTTAAT TCTGTATCTA TAATGACGAG TTCAATAACA TCCTGTGCTT TGTTTTTCAA 4800

TATATTTAAA ATTGCTGCAC CAGCCTGTTG ACCTAACATT CGAGGCTTGA TGTCAATACA 4860

GGTTTGTGGT GGTGACGCAA TTTCGGTTAA ATAAGAATCA TTGAACGTTG CTGTCATTAC 4920

ATCTTTCGGA ATTTCAATAT TAAGTTCATA TAGGACACTT AAAATCGCTA AATGTAACAT 4980

AGCATCTAAC GAAATGATTG CCTGTTTAAT ATTTGGGTCC TTCAAACGCG TATGTAGATT 5040

TTGCATGTAA TTAAAAATAA CTTCTCTTTC ATTACTAGTC TCAATAATTT GATAATTAAT 5100

TTTATTTTGA GAAGCTATCG TTTCAAATCC TTGAATTCTA TCTTTTGAAA CTTCAAAATT 5160

TCCTTTTTCT GTAATAAATA TTAATTCATC TACACCTTGT TCAATAACAT GTCGTGTCAA 5220

ATTTTCAGAA GCTAATATAT TATCATTATC TATATGTGTA AATTGATGAT CTATATCCGA 5280

TGTAGGCTTA CCAATCACAA TAAATGGCAT GCTTTCATCA ATTAACATTT GTTTAATCGG 5340

ATCATTTTCT TTTGAATAGA GCAGTATAAA CGCATCAACC ATTCGTTGTT TAATCATTTT 5400

ATAAACTTCA TCCATTAAAT CATTCATATT ATTTGAGACT GTCGTTTGTG TACCATAGCC 5460

ATGCTGGTTA CACGTTTCAG AAATTCCTAG CAATACATTG ATGTAGAATG GATTCAGTCG 5520
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AGTTCTAGCA GCGGTATTAG GAAAATAATT CAATTCTTCC ATAACTTTCT TCACTTTTGA 5640 

AATTGTCGCT TCGCTAATAC GTTGATTTCC TTTTATAACT CTTGAAACTG TCGAAGGAGA 5700

AACACCGGCT TTTAGTGCAA CATCTTTAAT CGTAACCATT TAATCACCTC CTGTTAATTT 5760

CTGCATCGGA AAACGCTTCC AACCACTGTA TAATACCAGT TTAGTCACAC TTTCTAAAAA 5820

AG7CAAAAGA TTTGTGCAAA CGATTGCATA AAACGATAAA AATAAAACCT TCATACTGAA 5880

ATTCAATCCG AAAATCAATA TAAAGGTTTG TATAAATATT AAAATCGATT GTTTAGTCAC 5940

TAACTGCAAA ATAGTTACCT TGGCCATCTT GAAAATTAAA TACACGTTGA CCATTCATTT 6000

CTACTATATC ATGCCCAGTT AAACCTAAAT CATTTAATTT TGAGTATAAT GCATCAAAGT 6060

TTTTCTCTTT pj^Qp^pj^ GATGGTGTTC CTAGGTTCAC TTCCGGGCTA TGCTTTTCAA 6120

TAAATTCTTT TGCCATAATC GTCAATGACG TTTCAGCATC TTTGGTAGGT GATACTTCAA 6180

CTGCAACATA GTCCTCAGCT AACGGTGTTT CACTTACAAC AACAAATTCT AAAGTTTCTG 6240

TCCAAAATGC TTTCGCTTTT TCGACATCAT CAACATATAA CATAACTTGA TTTAACTTTT 6300

CCATAAAATA GTACCTCTAT TTCTCTATAG TACATGCTAT CATAACACAG TAAATATTTT 6360

ATTACTTCAC AAAATGCTTA AAAATATGGC GGGATGCTTT TAAGGTCAAG GATAATACTT 6420

GTGTAATTTT TTATAGGTTG TAGCTACTCT ATCACACTCT CTTTTATATT TATCAAAAGA 6480

TATAAAAAAG GATAGTATCT TTCAACTATC CTTTAATCAA TATTATTCTT CAATCCATTG 6540

TGTATGGAAT ACGCCtTCTT TATCTTTTCT TTCGTACGTA TGAGCACCGA AGTAGTCACG 6600

TTGTGCTTGA ATTAAGTTTG CAGGTAAATC AGCAGCACGG TAACTATCAT AGTAATTAAT 6660

ACTTGATGAG AAACCAGGTG TTGGTACACC ATTTTGAACA CCAGTTGCGA CAACATCACG 6720

TAACGCATCT TGATATTCAG TAACGATGTT TTTAAAGTAA GGATCTAGCA ATAAGTTTTG 6780

TAATCCTGGA TTATTATCGT AAGCATCTTT GATCTTTTGT AAGAATTGTG CACGGATAAT 6840

GCAACCTTCT CTCCAAATCA TAGCTAAATC ACCAAGTTTT AAATTCCATT CATTATCTTC 6900

ACTTGCTTTA CGCATTTGcG CGAAACCTTG TGCATAAGAA CAAATTTTAC TCATATATAA 6960

TGCTTTACGA ATTTTTTCTA AAAAGTCTTT CTTGTCACCA TCAAATGATG CTTTTGGACC 7020

ATTTAATTCT TTAGAAGCAT TTACGCGCTC TTCTTTGaTT GAAGAGATAA AACGTGCAAA 7080

TACAGATTCA GTAATGATTG TTAATGGAAT ACCTAATTCT AATGCGTTAA TTGAAGTCCA 7140

TTTTCCTGTA CCTTTTTGaC CTGCAGTATC AAGAATTTTT TCAACTAATG CTTCTTTATT 7200

TTCATCTAAT TTCATGAAAA TATCACCAGT GATTTCAATT AAATAACTTT CTAATTCACC 7260
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CATTTTCACA TAGTGTCCAG CACCATTAGG TCCAATATAA GTAACACATG AAGCACCGTC 7440 

TTTTGCCTTT GCAGCAATTG CATCAAGAAT ATCTGCAACT TTGTTATAAG CTTCTTCTTG 7500

TCCACCCGGC ATTAATGACG GACCAGTTAA CGCTCCAATT TCACCACCAG AAACGCCCAT 7560

ACCAATAAAG TTGATTGCAC TTTGTGywAA TGCTTTATTA CGTCTGATAG TATCTTGATA 7620

GTTTGTATTA CCACCATCAA TTAAAATATC TCCATCATCT AATAAAGGTA ACAAACTATC 7680

AATCGTTGCG TCCGTAGCTT TACCTGCTTG AACCATTAAT AAAATTTTAC GTGGTTTTTC 7740

TAAAGAATTA ACAAATTCTT CCAATGAATA CGTTGGATGA ATATTTTTCC CTTTTGATTC 7800

TTCAACCATT AAATCAGTTT TTTCACTTGA GCGGTTAAAT ACAGATACAC TATATCCGCG 7860

TGATTCAATA TTCCAAGCTA GGTTTTTACC CATAACGGCT AAACCAATAA CTCCAATTTG 7920

TTGTGTCATA TTACTTACCT CACTTGTTGA TTTTTCATTA GTATTGTATC ACAAAATAGA 7980

CATACACTAC ACTAAATCAT TTCGAATGTC GCGCAACTAT TTTGATTATT TCTAACACTT 8040

GACTTGCAAG CAAGTTCAAT GATTTAATCG GCATTCTCTC ATTTGTTGTA TGGATTTTTT 8100

CATAACCCAC TCCTAAAATG ACTGAAGGAA TACCAAATGT ATTAATAATA CTGCCGTCTG 8160

AACCGCCACC AGAAATAATT GTATTTGCAG ATAATCCTAA ATTACGAGCA CTTTCTTGTG 8220

CAATTTTAAC AACCGCTTCA TTATCATTAA TTTTAAATCC TGGATAACTT TGCTCCACTG 8280

TAACTACTGC TTTCCCACCT AATTCTGATG CAGTAGTTTC AAACACATCA GTCATATGTT 8340

TGACTTGTGT TTTTATTCTT TCTGGATCGT GAGAACGTGC CTCTGCTTCT AAAATGACTT 8400

CATCTGCAAC AATATTCGTA GCTGAACCGC CATGAAACTT ACCAATATTG GCAGTAGTTA 8460

TTTCATCAAC TTGTCCTAAT TTCATTCGAC TAATTGcTTT CGCCGCAATA TTAATAGCAC 8520

TAACACCCTC TTTTGGCGTA CTTGCATGAG CCGTTTTGCC AAAAATTTTA GCTGAAATTA 8580

ACATTTGCGT CGGTGCACCT ACAACCGTAG TACCGACATC AGCACTTGCA TCAATAGCAT 8640

AACCAAAGTC CGCGTCCAAC AACTCTGAAT TTAATTCTTT AGCACCAATT AAACCTGATT 8700

CTTCTCCAAC AGTAATCACA AATTGAATTT GTCCATGTGG GATTTGTTGT TCCTTTATCA 8760

CTTGCAAAAC TTCAAGCATC GCTGATAATC CTGCTTTATC ATCTGCACCT AGAATAGTCG 8820

TACCATCAGA GTATATGTAG CCGTCATCTT TTACAATTGG CTTTACATTA ATTGCGGGTA 8880

CAACAGTATC CATATGGCTC GTCAAATATA ATTTAGGTAC TTCGCCTTCT TCGATAGTAC 8940

TATTCATTGT ACACACTAGA TTATTGGCAC CTAATTTAGG ATGTTTAGCC GCTTCATCTT 9000

CTTTAACATC TAACCCTAAT GCTATGAATT TTTCTTTTAA AATAGGTTGG ATTGTTGATT 9060

CATTCCCTGT CTCAGAATCG ATTTGTACAA GTTCAAAAAA CGTATTAAGT AATCTTTGCT 9120
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GATGAAATAA AATGTTACAG TAATTGACGT TACACAGATT TATCAGGTTT GTAAATTGTG 9240 

TCATATTATT TTCAATTTAT TATATATAAT TATTGTAACT CAAACTAAGC TTTGTCAAAA 9300

ATATATTGAT TGATTTTTCA AAGATATCGT ATAATGAGGA AAATGACATA AGCAAACTTA 9360

CTCATGTTTT ITATTATATT CCTTTATGAT GATTGCTAGT TATATCGTCT CAAGTTAAAA 9420

GTTTTATATC TTATGTCGTA ATTATTAATA CAAAGGTTAT TCATTTGGAG GCACACAAAA 9480

TGCAAAATAA AGTTTTAAGA ATTATCATTA TCGTTATGCT TGTATCAGTT GTATTAGCAT 9540

TGTTATTAAC GAGTATCATT CCAATTTTAT AAACTATATC TCAACTACCT ATACAAAATC 9600

ATACAATTAA AAATCCATCC ATTATAAACG CATGTATTAA TAAGTTATCG TATTGCAACG 9660

ATTACTTTCA pj^Qj^GGGTC ATACGGATGG ATTATTTTTT AAGCTACTTC ACTATGCATT 9720

TTCAATGAAC CAAATTGCGA TTTGATTTGT AAATATTCTT CTAATTCATT TAATATTTGA 9780

ATAATACTTG CTCTCGAGTT AAGCGCTTTG TGTGTTGTTG GCAATGGCAG TTCATCCAAT 9840

TTCAAACGCG TCTCATACAA ATTGTGTAAA CGCATTGCTG TATAGTCATT ACTATTCACA 9900

TTTAGACCAA TTTCTTTCAG CAGTGACGCA ACATCATTTA AAAGCGGATC TTTATGACAG 9960

ATACTTTCGA TGAGCGGTTT CATTCTCATT AACAATTCCA CTTGCTCTTC TCGCATATCA 10020

AAATAATGAT AGTATGAATT TTCGTTTCTA ACAAAATGAT TTTTAACATC TCGGAACGCG 10080

ATAGACTtCG CCTTTTTAAT ATTTAAAAGT AACACTTCAA ATTCAATCGC AATGGTATCT 10140

TCATATTTTT CACAAATATA ACTATATTTA CTAAAAATAT CAGCAATTTG TTGCTCAATT 10200

TTACATTTGT ATTCGTCtAG TTGTTTGTCT AAACTTGGCA TCATTAAATT CaTTGTAAAT 10260

GCAATGCTTA GTCCAATTAA CAGTAATAAT GTTTCATTAA CAATTAAATG TGCATCAATT 10320

GATTTTGCAT TAAAAACATG AAGTAATATA ACGCAACTCG TAATGACACC TTCTTGTACT 10380

TTTAATACGA CAGTTAATGG TATAAATAAC AATACGATAA TACCGAGTAC AATTGGACTC 10440

TGACCTAATA AACTAAATAT TGCTGAACCT AAAAACAATA CTAAAAAACA TGATACTAAT 10500

CTTGAAATAA TCGCTTGTAG CGAATGTACT TTTGTATGTT TAATACATAA TACGACTAAT 10560

ATGGCGCTTG AAGCATAATT ATCTAAACCT AACAGCTTAC TAATAATTAC ACCTAAAGTC 10620

ATACCCACTG CTGTTTTTAT TGTTCTAAAT CCAATCTTGT AAGGATTTAA CTTTAACATG 10680

GGTTAGCGCC TCTTATCTTT CTTCACAATA TTTATTGAAT AATGTTTGTA ATTGATTAAT 10740

TACGTTCATC ACATCATGAC CTTCGATTTG ATGTCTTTCA ATCATTTCTG TAATCTTTCC 10800

ATCTTTTACT AATGCAAATG ACGGACTTGA AGGCGCATAA CCTTCGAAGT ATTCACGCGC 10860

_______ ^^^^ T ^ A ^ ^ TT ^ TPrAr;r AAATAOTGT^ A^TAGACOAT CAGGTAATAr : 0920 
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AGAATTGATC ATAACTAGTG TTGTACCATC TTGTTTAAGA ACTTTGTCAA CATCTTCTGC 11040 

AGTAGTTAAT TGCTCATATC CCGCAGATTC AATTTCATTC CTTGCTTGTT CTACAACACC 11100

GTTCATGTAT AAATCGAAAT TCATGnCCAT AAGTTCAATC ACCTATCCCT TTATATTTAA 11160

ACTAtCCTCA TTCTACTAAT TAATAACATA TTGTTCAATA AACTAATCTG AATCACACCT 11220

ATATTTAGAC ACAATTTTAA CAATATACCA AACATTATTG TGCTTAAAAT CATGGTAACT 11280

AATTTGTTCA CATGTTTTCA TTAATATGTT TCAAGTATGA TGTCTTATTT TGACTTTACT 11340

GCAAAAATGC ATTCAACCAT GTTGATTATT GTTCTTTATC TTTTTTGAAT ATATTGCACA 11400

TATTTTAGTG CCAAAAAATA ATACATCCAT CGACAAGAAC AAGATAAAAC AAGTTGTCGA 11460

TAGATGCATC TATGTTATCA CTAATATATA TTTGTATTTT CTAAAGTATA CTGTTCGATA 11520

CGCTGTTTAA TATGATTCAT ArATTTACCT GTTTGTAAAC CATCTAAAAT ACGATGATCA 11580

ATTGAAATAC ATAAATTAAC CATGTTACGA ATTGCAATCA TATCATTAAT TACTACTGGC 11640

TTTTTAACGA TTGATTCTAC TTGTAAAATC GCTGCTTGTG GATGATTTAT AATACCCATT 11700

GATGATACTG AACCAAATGT ACCAGTATTA TTTACCGTAA ATGTACCGCC CTGCATATCT 11760

TCAGCTGTCA ATTGCTTATT ACGCGCTTTC GTTGCTAAAG TATTAATTTC TCTAGCTATA 11820

CCTTTGATTG ACTTTTCGTC TGCATGCTTA ATCACAGGTA CGTATAATTT ATTTTCATCA 11880

GCAACAGCAA TTGAAATATT AATGTCTTTA TGTAAGACAA TTTCATTTCC TTGCCAGCTA 11940

CTATTTAATA AAGGATATGC TTTTAAAGCA TCTGCTACAG CTTTTACAAA GAAAGCAAAG 12000

AACGTTAGAT TATATCCTTC TTTATTTTTA AAGCTGTTTT TATAATGATT TCTCGTATTC 12060

ACAAGATTTG TAGCATCTAC TTCAATCATC ATCCATGCAT GTGGAATCTC TGTTACACTA 12120

TTAACCATAT TTTGCGCAAT TGCTTTACGC ACACCATTTA CTGGTATTGT GCTGTTTTCA 12180

CTATTGTCTT CAGATGATTG GTTACTTGAT GTATCTACTG ATGTTGATTT TGTTTGAACT 12240

TGTTTGTCAG ATTGAGCTGT GGTACCACCA TTTTCAATAA CTGACATTAT ATCCTTCTTA 12300

GTTACACGAC CTTCAAATCC ACTACCTACA ACTTGTGATA AATCAATGTC ATGCTCTGAA 12360

GCGAGTTTAA ATACAACAGG TGAAAAGCGA CCATTATTAC GTGGTTGATT TTGTTTAGCA 12420

GTAGATGTCT GTTCCACTGT TGCACTAGCT TTTTTAGTAG ATTTCTGAGT ATGCTCATCC 12480

ACTTTTGCTT GTATCTCTTC AGTTGTTTCA TTTGTCTTTT CATCAGCAGT TTCAATTTTA 12540

CAGATAATTG TATCAATAGC TACTGTCTGC CCCGCTTCAA CTAAAATTTC TGTAATTGTT 12600

CCTGATATCG TGGAAGGGAC TTCAGCTGTC ACTTTATCTG TAATAACTTC ACATAATGGT 12660

TCATATTCAT CAATATGATC ACCAACAGAA ACTAACCATT GTTCAATGGT GCCTTCATGA 12720
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AATTCACGCA TTTTATTTAA GATTTTTTCT GGATTCATCA TAATTTCATT TTCTAATACA 12840

GGAGAAAATG GCATAGATGG TACAtCTGGA GCAGCTAAAC GCATGATTGG TGCATCTAAA 12900

TCGAACAAGC AATGCTCTGC AATAATCGCT GACACTTCTG ACATAATACT ACCTTCTAAA 12960

TTATCTTCAG TTACAAGTAA AACTTTACCT GTATGTTTAG CACGATCAAT AATTGTTTCT 13020

TTATCTAATG GATAAACAGT TCGTAAATCA ACGACTTCAA CATTGATACC GTCTGCAGCT 13080

AAAATATCCG CTGCTTGTAA ACAATAATTG ACCATTAATC CATAACAAAA TACTGTTAAA 13140

TCTTCACCTT CACGTTTCAC ATCTGCTTTT CCTAAAGGTA CAGTGTAATA TTCTTCTGGC 13200

ACTTCTTCCT TTAAGAAACG ATAAGCTTTT TTATGCTCAA AGTACAATAC TGGATCATTT 13260

GATTCGATAG ATGATAATAA AAGCCCTTTA GCATCATACG GTGTGGAAGG AATAACAATT 13320

GTTAAACCTG GCGATGAAGC AAATATACTT TCAATACTTT GTGAATGATA TAGTCCTCCG 13380

TGAACACCGc CACCAAATGG TGCACGAATC GTTAATGGGC ATTGCCAATC ATTATTTGAA 13440

CGATAACGCA TTTTCGCAGC TTCACTAATA ATTTGATTTG TCGCAGGTAA AATAAAATCT 13500

GCAAATTGAA TTTCTGCAAT TGGTCTTTTA CCTACCATAG CTGCACCAAT GGCAGTTCCA 13560

ACAATATTTG ACTCAGCTAA TGGCGTATCG ATAACTCTGT CTTCACCATA TTTTTGTTGC 13620

AGTCCTTGAG TAGTACCAAA TACGCCACCT TTTCTACCAA CATCTTCACC AAGAATAAAC 13680

ACATCTTTAT TTTGTTGTAA TGCTAAGTCT TGTGCCtGcG TATCGCCTCT AAATAAGATA 13740

ATTTAGCCAT TAGTTAAGAC TCCCTTCTTC GTACACAAAT GCATAGGCTT CTTCGACACT 13800

TGGATATGGC GCGTCTTCAG CAGCCTTTGT CGCTTTATTG ATGATGTCTT TnATgTCCGC 13860

TTCTATTTCT GCCAACCAAG CATCATCGAT AATGCCAGCT GAAAGCAACT CTTTTTTGAA 13920

CTTTTCATTG CAGTCTGCTT TTTTAAGcGT TTCACGCTCT TCTTTCGTAC GATATTGGTC 13980

GTCSTCATCT GATGAATGAG CTGTCATACG ACTTGTTACT GCTTCAATCA AAGTTGAACC 14040

TTGACCAGAA ATAGCTCGAT CTCTTGCTTC TTTCATCGCT TTATACATTG CTAATGGATC 14100

ATTACCATCT ACTTGTTCAC CATGTATACC GTAACCAAGT GCTCTATCCG ATAATTTTTC 14160

AGCTGCGTAT TGTAATGAAT CAGGTACTGA AATTGCATAT TTATTATTTA TAATGACACA 14220

TACAAAAGGA AGTTTGTGTA CACCCGCGAA GTTTAAACCT TCATGGAAGT CACCTTGGTT 14280

TGAGCTACCT TCACCAACAG TTGCTGTTGC AATTTTCTTC TTACCATCCA TTTTTAAAGC 14340

TAAAGCAGCA CCAACAGCAT GGGGTATTTG AGTTGCTACC GGTGAACTTT GAGACAAAAT 14400

ATTCTTAGCT CTACTACTAA AGTGTGATGG CATTTGTTTT CCACCAGAGT TAACATCGTC 14460
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

AATCTGAGTT GCTTCTTGTC CTTGACCACT TACAACAAAT GGAATTTTAC CTGCACGGTT 14640

CAATAACCAC AGTCTTTCAT CTATTTTTCT ACCTAAATCC ATCCATTTAT ATATTACTTT 14700

TAGGTCTTCT TCGCTAAGGC CTAATGATTT ATAATCAATC ATGTTAAATC CTCCTATTTA 14760

TACGTGAATA GCTCTACTTT CTGCTTTCAA TCCTAATTCC ATCAACACTT CAGAGATGGA 14820

AGGATGTGCG TGTGTTGTTA GTCCTAATTC TAATGCCGAG CCATTCATGA ACTGTAACAG 14880

TGATGCCTCA TTAATCAATT CTGTTACATG TGGACCAATC ATATTAATAC CCACAATTTC 14940

TTCAGTTGAT TGATCAATCA CCATTTCGCT ATACCCTTCG TTTGTGTCAT GGCTATCAAT 15000

CACTGCTTTA CCAATTGCTT TAAATGGTAC TTTAAAACTT TTAACTTTCA TTCCCTCTGC 15060

CTTTGCTTGT TCAATGTTTA AACCGATAGA AGCAATTTCA GGTTGTGAAT AAATACACTT 15120

AGGCATCATG TTATAGTTTA CTGGGATTGG GTTCCCCTCA AACATATGAT CAACAGCCAC 15180

AACACCTTCT TTTGATCCAA CATGTGCCAA TTGTAATTTT CCTATACAAT CACCAGCTGC 15240

ATAAATATGT TTATCTTCAG TTTGTTGAAA TTCGTTCGTT AAAATATGTC CTGATGTTGa 15300

AAGtTTTATT TTAGTGTTGT TTAAACCAAT ATCTGATGTG TTAGGTTTTC TACCAATCGA 15360

TAGCAACACT TTATCTACTT TAATTATGTC TGAGGAAATT TCAAACGTAA CACCATCTTC 15420

GTTAACATTT ATATCATTTT CAGAAAGTTT TATTCCCTCA TAGAATTTAA CACCACGTGC 15480

TGACAATGAT TTTTTTAATA GTTGTGAAGC TTGTTTACTT TCAGTTGGTA AAATTCTTTC 15540

ACCTGCTTCT ATAACTGTTA CGTCAACACC TAAATCTATC ATCAATGATG CAAATTCCAT 15600

TCCGATAACA CCACCACCAA TAATACCAAT ACTTGATGGT AACGTCTTTA ATGATAATAT 15660

ATCATCGCTA GATAAAATTT TATCATGATC AAATGATAAG AATGGCAACT CTGCAGGCGA 15720

AGAACCAGTT GCAATTAATA CAAATTGGTT GGGTAATAAG TCTGATTCAC CATCTTCATA 15780

TTCGACAGAA ATTGTGCCAC TTTGAGGTGA AAATATAGAT GTACCTAGAA TACGTCCCGT 15840

GCCATTATAA ATGTCAATGT GATTGTGTTG CATTAAATGC TTTACACCTT GATACATTTG 15900

ATTAATAATG TCTTCTTTTC GTGCCAACAT ATTTTCAAAA TTAACATTAG CATCTTTGAC 15960

ATCAACGCCA AACATTGCTG CCTGTTTTAC TGTTTGAAAT ACTTCAGCAG ATTTAAGCAG 16020

CGATTTAGTA GGAATACAAC CTTTATGGAG ACAAGTACCT CCTAATAGTT GTCGTTCTAC 16080

TATTGCCACT TTTTTACCTA ATTGAGACGC ACGTATCGCA GCAACATATC CTGCAGTACC 16140

TCCACCGAGA ACGACTAAAT CATATTGTTT CTCTGACATG TTCTTACTCC TAACTAATGA 16200

TATATATCCA TTGAAAATTT ATTAATACAT AGTTTTCATG TCCATTAATT ACCTATTTTA 16260

CATGATTGTC TATTTAGTTT GAATGCACAT AAATAAATCC ATAAATGAGT ATTCAACACA 16320
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

TAAATCAGTA ACACTTGCAC CTGAAATCAT TCGTGCAATT TCATCTACTT TATCATCGCT 16440

AATTAACTCT TGAACTTGTG TTGTTGTACG ATCATCTTTT GATGATTTCG AAATTAATAA 16500

ATGATGGTCG CTCATCGATG CAACTTGTGG TAAGTGAGAG ATACAAATAA CTTGTATATA 16560

TTCTGCTaTA TCTCGCATTT TCTCTGCCAT TT 16592
(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13794 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

CCAATACAAC GTAAAAAGAT TGCTTGTGTT ATTAATGAGT TAGATAAAAT AATTAAAGGA 60

TTTAATAAGG AAAGAGACTA CATAAAATAT CAATGGGCTC CAAAATATAG CAAAGAnTTT 120

TTTATACTTT TTATGAACAT TATGTACTCA AAAGATTTTT TAAAATATCG ATTTAATTTA 180

ACATTTCTTG ATTTATCTAT CTTATATGTA ATATCATCTC GAAAAAATGA GATACTAAAT 240

TTAAAAGATT TGTTTGAAAG TATTAGATTT ATGTATCCTC AAATTGTTAG GTCAGTTAAT 300

AGATTAAATA ATAAAGGTAT GCTAATCAAA GAACGATCCC TTGCAGATGA AAGGATTGTG 360

TTAATCAAAA TAAATAAAAT ACAATATAAC ACTATTAAAA GCATATTCAC AGATACTTCC 420

AAGATTCTCA AACCAAGAAA ATTTTTCTTT TAAATTTAAA CAGATTTACC TCTTGATAAA 480

ATAAATAAGC AATCATACTA CTTCTCAATT TAGTATAAAT AAAAATACAT AATTAACTTT 540

CTTTTGTTTT TATATTATTT CAATACCCTA CTATATATCA CAACACATAA ATTAAGCATG 600

ACAOTCATTC AATTTAGTTC ACCATTTCGT GTTCCAATTT TACTGAGTAT CATGCTTTTA 660

ATGTTATAAA CCTAATGCTT TAATAAATCG TGTTAATTCT TCTCGCATAC TGTCATCTTT 720

CAATGCATAT TCTATGGTAG TTTTAACGAA GCCTAATTTT TCTCCAACGT CATAACGTTC 780

GCCTTCGAAG TCATATGCAT ACACTTGGTT ATCATTATTC ATACGTTCAA TCGCATCTGT 840

TAACTGAATT TCGTTACCTG CGCCTTCTTT TTGCGTTTTT AAATAATCGA AAATTTCAGG 900

CGTTAATACA TAACGTCCCA TAATAGCTAG GTTTGATGGT GCCGTACCTT GTGCTGGCTT 960

TTCAACAAAC TTTTTCACTT CATACTGACG TCCGTTTTTA GTTAATGGGT CAATAATTCC 1020

ATAACGATGA GTATCTGCTT CCGGAACTTC TTGGACACCT ATAACTGAGT GCCCTGTTTC 1080
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TAAACCTTTT TGTTCTTTCT GCCTTACATA AAAAATATTC GCAAGTTCCG TTGAATACTG 1260

AACTTTCTCT AGTAATTCAG ATTTACCTTT TTCTTTTAAC ACCATTTCTA ATTCTTTTTG 1320

ACTATCAAAA TGATCTTCAA TCGCGCGTTT GTGGCGACCT GTCACTATAA TAATATCTTC 1380

AATTCCAGCT CTTGCAGCTT CTTCAACGAT ATATTGTATT GTGGGTTTAT CTAAGATAGG 1440

AAGCATTTCC TTTGGCATCG CTTTAGTTGC TGGTAAAAAT CTAGTCCCTA AACCAGCAGC 1500

GGGAATGATT GCCTTTTTTA TTTTTTTCAA AGTTAATGTG CTCCTTTTCC TAAGTATTAA 1560

ATCTATGTAT CAACGTCATT TTAACACTAA TTAGAACGCC TTCATAGTGT CATTGAGTAT 1620

GTAATTATTT CTTGGGAAAT TTGTTTTAAT TTTAAAAAAC AGGCTTACTT CATATAATTT 1680

ATGAAATAAA CCTGTCAATT TTGGATTGAT TATGCTTTGT GATTCTTTTT ATTTCTGCGT 1740

AATAACGCTA AACCTAAAAT GCTAAATAAT CCGCCGAACA ACATGCCGTT GTTTGTTGAT 1800

TCTTCTCCAC CTGTTTCAGG TAGTTCAGAT TTCTTAGATT GTGCTTTTTT AGTTGGTACC 1860

ACTGCTTTAA CCTTTTCATT GATTTCAATA ACAGGTGTTA CTACTTTACC TTGTTCCACT 1920

GGTTTAGAAG GTTTTTTAGG TTCTTCTTTA GCAGGTGGTA TTGGTTTACC AGGTTCAGTT 1980

GGTACCTCTG GCGTTGGCGG TGTTGGTGTT TCCGGCTCGC TTGGTACTTC TGGTGTCGGT 2040

GGTGTTGGTG TTTCCGGCTC GCTTGGTACT TCTGGTGTCG GTGGCGTTGG TGGCACGATT 2100

GGAGGTGTTG TATCTTCTTC AATCGTTTGT TGACCTTCAT TATGACCACT TACTTGTGGA 2160

AGTGTATCTT CTTCAAAGTC AACACTATTG TGTCCACCGA ATTGATAATT TGGTTTATCT 2220

TTATTTGTAT CTTCTTCAAT AATTTCAGTG TGCTTATTGA ATCCGTGAAT ATGTGGCACA 2280

CTGTCGAAGT CGATATCAAT GATATTACCA CCTTGTTCAT ACTTAGGTTT GTCTTTCTCT 2340

GTATCTTCTT CGAATGATTG GTTACCATTA TTTTGACCAT GAATTTGAGG TACACTATCG 2400

AAAXCGATAT CTACGATATT GCCACCTTGT TCATATTTCG GTTTATCTTC TTCTGTGTCT 2460

TCCTCAAATG ACTGATTACC GCTATTTTGG CCACCTTCGT AACCTAATTC ACTCTTAATA 2520

TCCACGTGGC TATTTTCTTC GATTTCTTCA ATCACGCCAT AATTACCGTG ACCATTTTCA 2580

GTTCCTAAAC CAGAATGAGA AATATGATGA TTGTTTTCAG TAATTTCCTC GATTGGTCCT 2640

TGCGCTTGAC CATGTTCTTC AGGTAGTTCA TCTACTAGTT CAATCAGATT ACTTTCAGTC 2700

GTATATTCTT TCGTATCTTC AATTGTTGTA TGATCGCTAA CAGCACCAGT TACAATACCT 2760

TTTGTAGAAT CTTCGTCAAA TTCAACTAGG TTAGACTCAG TAGTAACCTG ACCACCACCT 2820

GGGTTTGTAT CTTCTTCATA TTCAACAACA TCAGCATGAT GTTTTGAATT TTCATGTGTC 2880

GATTCTTCAA AGTCTACATG AATAGAATCT TCTTCAGTTT CAATGGTACC TTCTGCATGA 2940
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TCTTCGATTG TACCAGTCAA TTCATGCTTC TCCACTGGCG GCTCTGATTT AAATTCAAGT 3060

TCGATAGGAG TACTATGTTC TATAATAGGT TCCTTTAGTT TATCTTTGCC GTCGCCTTGA 3120

GCGTTATTAG AGTAAAATGC AACGCCATTT TTCCaAGTTA AATTACTTGT ATAATAATAG 3180

TTATAATATC CAAAAAGGTG TGTTTGAAAT TCTAAGTTGC TAGCATTTGA ATCATAATAC 3240

CCTTCATATT TTATTACATA ATTTTTACTT TGGTCTAAAT TATTAAAGTT TAAAGAATAA 3300

CCACCATTAG TATCAAAATC TAAACTCATA TTATCAGTCA CATCTTCAAA TTTGCTGACA 3360

TCATCAAGCT TTGCATAnTn AgctTTCAGC TAAATCGTCT GAACCAATGT GTTTATATAC 3420

CTTAACTGTT GGATTATTAA CCCCTGGTTT ATTTCCTTTA GTTACTTGAC CAGTTACTGT 3480

CACAGAGCTT AACGACTGGT TGTTAGGTTT CATQTACGCA AAATGACTAA ATTTCCCATC 3540

TACTTTATTT AAAGTATCAA TTCGACCATT AGCTGTTACT CCCCAATTAT CTCTAACTCC 3600

ACCTAAATAT TGAATATTAA ATATTTTGCT AACCGTAGTC TCACCCAATT TAACTTCAAC 3660

ATTTTGGTTA CCTTTTTGCG TCACTGTTGT AGGATCAATA AATAGATTTA AAGATAATTC 3720

AGCAGTTAAA TCTTTCTTTT CTTGTACATA TTCTTTAAAC GTATATCTAA CTTTTCTTTC 3780

TCCAATTATT TCTCCTGTCG CCATAACTTG ACCATCTGTA CTTTTTATCT CCGGAACTTT 3840

ACGCAGTGTT GAGATACCAT GAGTTTCAAC ATTATCGCTT AATGTGAAAT CAAAATAATC 3900

TCCCGCCTTA ATTCCTTCTC CAAATTTCCA TTTATATTTC AAGGTTACTC TTTCTGCGTT 3960

ATGAGGATTT ACAACATTCG TATCTTGTTT ATGTCCTACA ATTTCACTAC CTTCTTCTAC 4020

TTCCACTTTA TTTGTTACAT CTGTACCTGT CGCTTTAGTT TCTTCCACTA CTTCTTTCTC 4080

TGCAACTGCT GTAACGTCAt TGatCTTTTC ATTCTTGGTT TAATTTCTGA GACGTTACTT 4140

GGTTGAGCTA TGTCAACTTG AGTTCCTGTA GTTTCCTTAT CAGCAACTTT TTCCGATGGC 4200

AAATCAACTC GCGAAgTTTC TACTTTTGGT GCTTGCAcAG TTTTCGGTGC TTCTTCTGTT 4260

GTTACTTGTG TTGATTGTGA TGGTTGCTCA GTTGATGTCG CGCTGTATGA TTGTGTTTCA 4320

TCTATTGTAT TAACGTTATT TGTAGTTGTT TGTGTTTCGC TTGCTTTACT TTCAGTAGCT 4380

GAACTCCCAC TTTCCTCTAC TGTAGTATTG TTTTGTTCCG ATGCTGCAGC TTCTTTTTCT 4440

TGTCCCATTC CAACAACGAT CATTGTTCCT AAGAATACTG AGGCCGCTCC CAATTTGTGT 4500

TTTCTTATGC CGTATCTAAG ATTGCTTTTC ACTATAATAT TCTCCCTTAA ATGCAAAATT 4560

CATTTATTTT TAAAACTCAA TAAATGCAAT TCTATATTGT TCGGTTTTTA AAAGCAATGA 4620

AAAAAAGCGA GTTAATAAAA AGTTAAGATT GTTGTTAACT TTATGTATAA TGAGTTTTTT 4680
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TACTAAACCA TACATAATAA TCGCCTGTAC AATGCATCAT TAACAAGTCA CTGAAACGCC 4860

TTTCATTGTA TTAATAACGT CACTATAATT TTTATATCGT TCGGTTTTTG TTTGATTTTA 4920

ATGATTATTT ATACAAAAAC AGCCGTATTT CAAGCCGACA TTTTAAATTT AACTAAATTT 4980

GCATCTAGTT AATAATTGCA TTTATCAAAT TTGTCTTATT GATCCAATCT AATTTGTACT 5040

CACAAACTAG TTTAAAATTC TAACTTTATC TCTCAGTTCG TTATCAATCA TCAGACATAA 5100

ACCAATGAAG CAATCAGAAA ACACTCTAAT TTTCTATTAG AAATTTGATT TAATATAAAA 5160

AAACAGGCTT ACTTCATATA ATTTATGAAA TAAACCCGTC AATTTTTGTT TAATTATGCT 5220

TTGTGATTCT TTTTATTTCT GCGTAATAAT GCTAAACCTA GAATGCTGAA TAATCCGCCG 5280

AACAACATAC CTTTGTTTGT TGATTCTTCT CCACCTGTTT CAGGTAGTTC AGATTTCTTA 5340

GATTGTGGTT TTTTAGTTGG TGCCACTGCT TTAACCTTTT CATTGATTTC AATAACAGGT 5400

GTTACTACTT TACCTTGTTC CACTGGTTTA GAAGGCTTTT TAGGTTCTTC TTTGGCAGGT 5460

GGTACTGGTT TACCAGGTTC AGCTGGTACC TCTGGTGTTG GCGGTGTTGG AGTTTCTGGC 5520

TCACTCGGCA CTTCTGGTGT CGGTGGTGTT GGTGTTTCCG GCTCACTTGG TACTTCTGGT 5580

GTTGGTGGCG TTGGTGTTTC CGGCTCACTT GGTACTTCTG GTGTCGGTGG CGTTGGTGGC 5640

ACGATTGGAG GTGTTGTATC TTCTTCAATC GTTTGTTGAC CTTCATTTTG GCCGCTTACT 5700

TTTGGAAGTG TATCTTCTTC AAAGTCAACA CTATTGTGTC CACCGAATTG ATAACTTGGT 5760

TTATCTTTAT TTGTATCTTC TTCAATAATT TCAGTGTGCT TATTGAATCC GTGAATATGT 5820

GGCACACTGT CGAAGTCGAT ATCAATGATG TTACCGCCAT GTTCATACTT AGGTTTGTCT 5880

TTTTCTGTAT CTTCCTCGAA TGACTGATTA CCTTTATTTT GACCATGAAT TTGAGGTACA 5940

CTATCAAAAT CGaTATCTAC GATATTGCCA CCTTGTTCAT ATTTAGGTTT GTCTTCTTCT 6000

GTGTCTTCCT CGAATGACTG GTTACCGCTA TTTTGGCCAC CTTCATAACC TAATTCACTC 6060

TTAATATCAA CGTGGCTATT TTCTTCGATT TCTTCAATCA CGTCATAATT CCCGTGACCA 6120

TTTTCAGTTC CTAAACCAGA ATGAGAAATA TGATGATTGT TTTTAGTAAT TTCCTCGACT 6180

GGTCCTTGTG CTTGACCATG CTCTTCAGGT AATTCATCCA CTAATTCAAT CAGATTACTT 6240

tCAGTTGTAT ATTCTTTCGT ATCTTCAACT GTTGTATGAT CGCTCACtGC GCCAGTTACA 6300

ATACCTTTTG TAGACTCTTC GTCAAATTCA ACTAAGTTAG ACTCAGTAGT AACCTGACCA 6360

CCACCTGGGT TTGTATCTTC TTCATATTCA ACAACATCAG CGTGATGTTT TGAATTTTCA 6420

TGTGTAGATT CTTCAAAGTC AATTGGATTT GATTCCTCAG AGGACTCAGT GTATCCTCCA 6480

ACGTGACCTG ctTCGCTATC CACAGCAGTA TGGTAATCGA TATCAATAGC TGATGAATCC 6540
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TGGTAATCAA TGTCAAGAGT TGATGAATCA TATTCCTCTT CAACAGTAGT TACTAAATTC 6660 

TTATCATATT GACCTGTAAG AGTTTCTTTA ATTGTATCTT CTTTATATTC AAATTTATTA 6720

TTTTGAATAA TCGGACCATT TTTCTCATTT CCGTTCGCTT TATTACTGTA TAAAACTAAA 6780

CCATTATCCC AAGTTAAGGT ATATCCTCTA TCATAATAAT ACTTATAAAG TTGCTCTGGA 6840

TGTCCTACCA TTTGTGTTCT AAAATCAACT TCATCAGTAC CATTTAAATA CTCTCCATCA 6900

TAGTGAACAA CATAAGTTTT ATCTAGATTT TCTATATTCA ATGAATAGCT TCCATTATTT 6960

TGTAAATTCA AATTCCCACT CATATTACTT GTGACTTCTT TAAATTTAGA AGTATCTGTC 7020

GTATTTGCAT ATACACTCTT CGCTATGTCT TCATTATTAC CCAAGTATTC AAATATCCTA 7080

ACTTTTGGTT GATTTCCATT CTGATTACTA CCTTTCATTA AAGTTCCAGT AACAGTCACA 7140

CTTGTCGTTT TACCATTATT AGGTTTAATA AATGCAACAT GCGAAAATCT ATTATTCGCT 7200

TTATTAAATG TCTCAATCGA TCCATTTAAA TTGGCATAAT AATTCCCAAT ACCATCTTTA 7260

TATTTAACAT CTAATTCCTT TGAAGTTTGT TCTTCATTTA GTGTTGAAGT TATAGTTTGA 7320

TTTCCATTAG TTTGTACAGT TTTAGGATCA ATAAATAAAT TAATTTCTAG TTCAGCCGTT 7380

ACATCAACCT TATCTTCAAT ATCATTTGTA AATGTATATC TAATCTTTCC ACCTTCTAAA 7440

ACTTCACCTG TCGCCATTAC GACTGAACCA TTTTTAATTT CTGGTACTTT TCTAGCAGTT 7500

GATACGCCAT GCGTATTTAC ATTATTTGAT AAAGTAAAGT CAAAGTAGTC ACCTTGATGT 7560

AAACCATTCT CAAATTTCAA CTTATATTTT AGTACCGCTC GTTGTCCTGC ATGAGGTTCT 7620

ACTTTATTTG TATTGTTATG CCCCTCAATA GAACCAATTT CTACTGTAAC TTTACTTGTT 7680

ACATCTGTAC CCGTTTCCAC TTTCGCGTTA CTAGCTTCCT TAGCTTCCGC TACATCTGCT 7740

GATCTTGTCA CACGTGGCTT ACTTTCTGAT GCCGTTCTTG GCTGTGCCAC TTCAACTTGT 7800

GTTTTTGCGA CTTGATTTTG TGTAGCCTTT TTAGGTGTTA AATCTACTTG TCTTTGATCT 7860

CCGCTATTGT CTTGAGATTG TGTTGTTTCC TTAACTTGAG GTTTCGCTTC TTCCTTAACT 7920

ACCTCTTCTT TAACTGTTTC TATATTTGCT GGTTGTGCAG TTTGTGGTGC TTGTACTGCT 7980

TTTGGTGCTT CTTCAGTTGT TACTTGTGTT GCGTTTGACG GTTGTTCTGT TACTGTTGCG 8040

TTATATGATT GAGTTTCTTC TATATGATTA ACGTTAGTTG CAGTTGTTTG TGTTTCACTT 8100

GTTTTATTAT CAGTAGCTGA ATTCCCATTT TCTTCTACTG TAGTTGTCTT TTGTTCTGAT 8160

GCTGCAGCTT CTTTGTCTTG TCCCATCCCA ACAACGATCA TTGTTCCTAA GAATACTGAT 8220

GCTGCTCCCA ATTTATGTTT TCTAATGCCG TACCTAAGAT TGTTTTTCAC TATAATATCT 8280
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ATGTTAATTG ATAATTTTAT TATTTGAAAT ATACCTATAA ATTGTATTCA AGTCATCAGA 8460

AACCCTTGTC ACACAAGGCT TGTATTTTTT ATACTTATTT TTTAAATTAA ATTCATCATT 8520

ATCTAATTTA AAACAATATA CTAAACGTTT CATAATTATC GCCTGTACAA TACGCACAAA 8580

AACATGTCTT GAAACGCCTT TCATTACTCT AAAATACCCA ATATACTTTT TATATCGTTC 8640

GGATTCTGAG TATTTCAGAC GATTTTCTGC ATAAAAATAA ACGTGTTTCA AGGCAATATA 8700

TTGCAATTAC CTAAAAACAC GTTTACTTAA TATTTAGTTA AACAAATAAG CTAATGAATA 8760

AAATGAAGAT GATACCTGAA ACGGAAATAA TCGTTTCTAA TAATGACCAT GTTAAGAATG 8820

TTTCTTTTAC AGTTAAACCA AAATATTCTT TAAACATCCA AAATCCTGCG TCATTTACAT 8880

GAGACAAAAT CACACTACCT GCACCTATCG CAAGTACAAC TAATGCAACA TTTACATCTG 8940

ATGATTGTAA TAATGGTAAG ACAATACCTG TAGTTGAAAT CGCAGCTACT GTAGCCGAAC 9000

CTAATGCGAT ACGTAGCACA GCTGCAACAA TCCATGCTAG TAAAATCGGA GACATCTCTG 9060

TACCTTCAAA CATTTTAGCA ATTGTATTTC CGACACCGCC GTCAATTAAT ACTTGTTTAA 9120

ATGTACCGCC ACCGCCAATA ATCAATAACA TCATTCCGAT TGGATAAATC GCATTCGTCA 9180

CTGATTCCAT AATATGATTC ATCTTACGCT TTCTCATTAA TCCCATCGTA ACGATTGCAA 9240

ATAATACTGC TATTAGCATG GCTGTCCCTG CTGTTCCTAT CATATAAATG ATAGATTCAA 9300

ATAGATTTGT AGGTTTGTCA TGCCCAGTTA CAAGTTGCGT TATCGTAGAC ACTAACATTA 9360

ATATGACTGG TAATGTTGCT GTTAATAAAC TCATACCAAA TCCTGGCATC TCTTGATCCG 9420

TAAATTCTTT TTGTGCACCT AACGCTGAAA TATCGCCTTC TCGTGTATAC GCAGACGGAA 9480

TCATTTTTTG TGCAcTTTGT TAAATATAGG CCCTGCAATG AGTGTAACTG GaATGGCAAT 9540

AATCATACCA TACAGTAATA CATCTCCAAC ATTTGCCTTT AATTCTTTTG CGATGACTAC 9600

CGGTCCTGGA TGTGGTGGTA AAAAGCCATG TGTCACTGAT AAAGCTGTTA CCATAGGTAG 9660

TCCTAGTTTT AACACTGAAA CATTTGCGCG TTTTGCTACT GTAAATACTA ATGGAATCAG 9720

TAAGACTAAA CCTACTTCAA AGAACAATGC AATACCGACG ATAAATGCTG CAACAAGCAT 9780

TGCCCATTGT ACATGTTTTT GACCAAATTT TTGAATCAAC GTGTCTGCGA TTCGAGTTGC 9840

ACCACCACCA TCAGCAAGCA ATTTCCCAAG TATGGCACCT AAACCGAATA TCAGTGCAAT 9900

GTGGCCGAGC GTACTGCCCA TTCCTTTCTC AATCGTCTCC ATAATTTTAG TCAATGGTAT 9960

ACCTAGCATT AACGCTGTAA TCATCGATGT GATAATTAAT GAAATAAATG TATTTAATTT 10020

AAACCCAATA ATTAATACTA ATAAAATAAC GATACCTAAA ACAACACTGA TTAACGGCCA 10080

TATTTCGTTA AACATGACAT TCCCCTCTTT CTCTTTTCAA TAGAATGTAA CACCGTCGTC 10140
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GAGTGACGTA TTTATTGTGT TTTATTTTCA GCGATATGTT GGCGTTGAAA ATCTGCAATT 10260 

TGTTCATAAT TCTCTGTTAA AGAACGACTT AAATTGATAA AAATGGATAC GATCTCTTGG 10320

TAAACAGTGA CATTTTCTTC AATCGGCGTA TGATTGTTTG TGGCACCGAC CATCGATGAA 10380

ACGATTGAAA AATCTTCAAT GTCACCTACA GCTTTAAGTC CGAGCACGCA GGCACCTAAG 10440

CATGAACTTT CATAACTTTC AGGAACCACT AACTCTGTGT CAAATATATC TGACATCATT 10500

TGACGCCATA CTTCACTTTT CGCAAAACCA CCTGTTGCTT TTATCATCTT AGGTGTTTCA 10560

TTCATTACTT CAATAAGCGC AAGATAGACG GTATACAAAT TGTAAAGAAC ACCTTCTAAT 10620

GCAGCGCGAA TCATATGTTC TTTTTTATGA GATAAAGTTA AACCGAAGAA TGAACCTCTT 10680

GCATTTGCGT TCCAAAGCGG CGCACGTTCT CCTGCTAAAT AGGGATGGAA TATTAAACCA 10740

TCTGCACCTG GTTTAACACG CTTTGCAATT TGAGTTAAGA CATCATAAGG ATCAACACCG 10800

AGACGTTTCG CAGTTTCGAC TTCACTCGCT AGCAACTCGT CGCGCAACCA TCTCAATACG 10860

ACACCACCAT TATTTACAGG ACCTCCGATG ACGTAGTGGT CCTCTGTTAA GACATAACAA 10920

AATATTCTAC CTTTGTAATC AGTACGCGGT TTATCTATCA CAGTACGAAT CGCCCCAGAT 10980

GTACCGATTG TGACAGCAAC TTCTCCTTTA CCAACACTAT TGACACCTAA ATTAGAAAGG 11040

ACCCCATCAC TCGCACCAAT AACAAACGGT GTATCTTTAT TAAGCCCCAT TAATGTTGCA 11100

TAACGTTCTT TCATACCTTT CAtCACATAC GTTGTTGGAA CTAATTCCGG CAACATTTCC 11160

TTGGAAATAC CCAGCAGTTC TAATGCCTCA ACATCCCAAT CTAATGTTTC TAAATTAAAC 11220

ATCCCTGTTG CGGAAGCCAT TGAATAATCA ATGATATATG TATCAAATAA ATGATAGAAA 11280

ATGTATGTTT TAATATCTGC AAACTTAGCA GTACGTTGAA ATACATCTTG CCATTCATGT 11340

TTCATCCAAA AAATCTTCGC TAATGGCGAC ATAGGATGAA TCGGTGTGCC TGTTCGCTGG 11400

TAAATCGCAT TGCCATCATG CACTTCATTT ATTACTGTTG CATATTTTGC AGCGCGGTTA 11460

TCTGCCCAAG TAATATTATT TGTTAATCTT TGATGTTGCT GATCCATCGC AATCAAGCTA 11520

TGCATTTGCG CACTAAATGA CACAAACTTA ATGTCGTCTT TATTAACTTT GGATTCTCTC 11580

ATAACATATT TAATAGTCAT TAGTACTGCA TCAAATAATT CATCTGGGTT TTCTTCTGAG 11640

ACATCAACGT TTGGTGTGTG TAAATCATAG CCTATTTGAT GTTTCATGAT AAAAGTTCCA 11700

TTTTCATCAT ATAAGACTGA CTTGGTACTC GTCGTTCCAA TGTCGACACC AATCATATAT 11760

TTCATGATAA ATCCTTCTTT CTTTCATTTT AATTCAACCA AAATCCTTCA ATATCTTTAC 11820

CAACATCGTC GAAATTTAAA TGAAACGCTT CTTTCAAAAT TTGACTGTCG TATTGTTCCA 11880

. - - — — ^ ~ - —r- » ~^^^^r^'r<-^r ^^^frT^^r r:^r:TT r, ^ , ^ rrwT * 1 1 94 n 
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AAAATGAGTT TAAATATTGA TGATTAGATG CTTTGATTAA TGTTTCATGA AATTCAAAGT 12060

CATGCTTCGT AAATGATTCT GCATCCTCAA ATTTTACTGC CACTTTCATC ATTTCAAGTT 12120

GTTTCTTCAT TTCTTTTACG ATAGGTAGTC GCTCTTGATT TTTAACTCTT GAAAATGCAA 12180

ATGACTCTAA CATCAGTCGC AAATCATACA TTTCTTTCTT TTCTTGTTCC CCAAACGGCA 12240

ACACATGTGC ACCCATTCTT TCTAATTGGA TGAGTTGATT TTGTTGCAAT AATTTAAATG 12300

CATCTCGAAT TGGCGAACGA CTCACATTAA ATTGCTTTGC CATTTGATTT TCAGTGAGTA 12360

ACGTACCTTC AGCTATGTGA CCATTCACAA TGCCTAAGCG TAATTCTGCC GCGATACCTT 12420

CTCCAGTTGT CATACCTTCC AACCATTTCT CTGGATATCC ATACATCATC AAAGTCACTC 12480

CTTCATTACA CGACATACTT GTATACAAGT ATGTTAATAT AGTTATTATG AGTTTGCAAG 12540

CGCTTTCTTT ACGAGCACTA AAATAGTGAC CACCCCTTTT CGATTTAAAT TTAAAGGAAA 12600

TGGTCACTAT CACACGAATG ATTTAATTGT TATGTTGTAT GTGGGATATT TCTAATTGTT 12660

CTGTACTCAT ATGCGCTTTA GGTACTTCAA TGCAATAATG CGTTTCATGA CAGTTTGGAC 12720

ATTCGAATCG ACGTGTTGTC GCTGTATGTT TCGCTTTGAT AACTGCCCAC AAAGATGGTG 12780

AGAATATATG CTGGCAGTTA GGACATAAAT AGGCAACCTT TTGTTGGTAA TAAAAAGTAA 12840

CACCAATGCC ATAACCAATC ATAAATGGTA AAGCAATTAA AAACGGCCAT TTATTTTTCA 12900

TCAAAATTGC ACTTATAATG CTAGAATATT GAATTATTCC TATAATACCA GCACTAATCC 12960

AAATGTTACG ACGAATACTT TTCATTTCAG CTGATTTACT CATGACATGC TCTATGTCTT 13020

TTAAGTGTGT GATTGGAGAC GTCGACGCTT CATTTACGTA ATATTGAACA TTTTTAATTT 13080

TGTTTAATAC CGCTTGTTGC TGTTTAACTT GTTGGTTAAT TTCTTGTTGT TTCATAGTTA 13140

GTAAAGTATT GAGCGTCTTC AAAGTACCTT CACCTTTTAG CAACATATCT ATATCGCTTA 13200

ACGC&CAACC TAAATCTTTA AGCAATAAGA TTAACTCTAA TGTTTGTCGC TGTTGTTCTG 13260

TATACACACG ACGCTTTCCT TCTGTAAATC CTTGTGGTTT CAAAATACCT TTGCGATCAT 13320

AATATTGAAT CGTTCGTGTT GTCACATTGC ATAATTTTGC GAGTTCTCCA GTCGAATAGT 13380

TAGACATAGA TTCCACCTCC TATAATTACC ATAGTTGATG ACCCGACGTC ACGAGCAAGT 13440

ACAATTTCCA CATTTTAAAG AAATTTATTA TACTAGGCGT CTTATTTTTA TGATTTCGTA 13500

CCATGTTGAT TTACAAACTC ACTCAAACTA AGTAACACAC CTACTAAACA TCTACTCTGT 13560

TATTTCAGAA TGAATTTGTT GTAATTTATC TTCAACTTCA GTAATCTCTG TCGCACATTC 13620

TTTCAGTAAA TCTCGATACT TTTCCGTCTC TGCATTGTTT TTATAACGTA TTTTATGTTC 13680

TAAACTTGcC CACATATCCA TACCTATCGT TCTAATTTGA ATTTCAACAG GCAATACCTC 13740
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(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1059 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

GGATAAGTTC AGGTAAATTC ATTTCTTTTT CAATTTTGAT TTTCATTGTT TCCGCCCTTT 60

TAAAATAAAG TTAGTTGCTT CTGTTCCTCA TATTCCAAAT CACTTTGCTT TATATATGTT 120

TCAAGCTCTT CCGCTGTATC AAATGTCTTT TTCACACCTT GCCAACCTGG CACGATATGA 180

CCGTGAAAGT AATAAGTGCC ATTTACTACA TGGATATGTG CCACTCGTTC GTTATCCTGA 240

TACAGATATC TCTTAGATCC AAAGAATTGA TTTAGGTATT CTTTACGCGC GCTATCTGTC 300

ATGGTCATCA CTCCTTTTAA CAATTAGGCA GACCAAACGA CATGCATTCG TCGTATAGCT 360

CTTCATTACT TATGCTTGCC TTATAGTTTT CAATCACATT GCTAACTTCT TTATGACTCA 420

TTGCTTTAAC TTGTTCGTCT GTATATTTTT CGCAGTCTTC TAATTCCAGT TGCTCCTGTA 480

ATGACATCAC ATATTCAACT TGTCTTTGGG TTGCCATCGT TAACCCTCCC ACAAGTCAAA 540

AGCTCTTTGG ACGTAAAACT TCGCCTTTGC TAAATCCTCA TGACCATTCT TTAACGGTGC 600

TCTAGACATG TATTTGATTG CATTACCTAT TGCGAATGCT AGTTGAGGTG GATACTGTGC 660

CGTAACCTGT TCGATAAAAT CTATAATTTC AATGTCGCCG TATGTGTAGT GCGCTGGTTG 720

CTTAACATTG TCTTGCGCTT CGTTCATATC TACTTTTCTG TTACTGATTA CGCTCATTAT 780

GCTTCACTCC ATTTCTTGAA CATTTGGTTA TAAGTGACAT CGAACCAGTA CGGATCACGT 840

GAATSTTTTT GTGGCGTTCC ATCATAAAGC CATGGTCTTA ATCTTCTCTT TCTTTCCTGT 900

TCATATTCCG CTCTCACATT TCGTTGGTAT CGGTTCAAAA TCGCTTTTTT TCTGATTTTT 960

TCTCTCCCTT TTTCTTCATC TTTnATtTGA CTCTnCATAT ATTCAACTTC TTCTGTAGAT 1020

nTTGAGTCCT TTCTTCCACA CAATAATTCA nCGCCGCGC 1059
(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30246 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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GAAGTAAAAG AAGAATTAAA TTTAACATTA ACAATGGATG AAATTGAATA TGTCGGGACA 60 

ATTGTAGGTC CTGCATATCC ACAACAGGAT ATGTTAACTG AGTTAAATGG ATTTCGCGCA 120

TTAACCAAAA TCGATTGGGA AAACGTAACT ATCAATAATG AAATTACGGA TATACGCTGG 180

ATTGATAAAG ATAATGATGC GTTGATTGCG CCTGCTGTCA AAGTTTGGAT TGAAACTTAT 240

GGTGGTAAAC ATGACAAATA ATGACACCAT CATGTTACGA CATTATGTCC CACAAGATTA 300

TTCGATGTTA GAAGCTTTTC AATTAAGTGA AAGTGATTTG AAGTTTGTTA AAACGCCAGA 360

GGAAAATATT ACAGCTGCAA TGTCTGATAA TGAAAGGTAT CCCATCGTTG TAATGGATGG 420

CAGGCAATGT GTGGCCTTTT TTACATTACA TCGTGGAAAA GGGGTCGCAC CATTTAGCGA 480

TAACCAAGAT GCAGTATTTT TCAGGTCATT TAGTGTTGAT CAACGTTATC GTAATAGAGG 540

AATAGGTAAA GTGGTAATGG AAAAATTGGC GTCATTTATC ACTTCAACAT TTCAGGATAT 600

TAATGAGATT GTGTTAACGG TTAATACTGA CAATCCACAT GCCATGGCAC TTTATCGCCA 660

ACAAGGATAT CAATATATGG GAGATAGTAT GTTCGTCGGA AGACCTGTTC ATATTATGGC 720

GTTAACTATA AAATAAATTA AATTTAAAAG CATCTTTACT CATCGTCGAC CACAACAATT 780

AATGATGAAT AAAGGTGCTT TTTGTTATAG ATCATCGGAC AATTTACTAT AGTAAAAAGC 840

GACCTAGTGA ACAATTGACA TATATCCACA GGTCGCTTAA CTTAAGTTAT ATTGCTAGTT 900

GCGATTAATT GATAGACTCA TCATTTTTGC GCTGTCGAGA TGGTCTTTTT ATTAAAAATG 960

CCGTAATCCA AGCCGTAATC GGAATACTGA TTGCAACGGC AATACCGCCT AAAATAATAG 1020

AAATAAATTC TTGGGCAAAT ATTTTCGAGT TTATAATATG ACCAAATGAA TATTTAAGTT 1080

TGAAAAACCA AATAAATAAA GCAAGTTGGC CACCAAAAAA GGCAAGGTAA ATCGTGTTCG 1140

CAGATGTCGC TAAAATTTCT CTACCAACAC GCATGCCAGA TTGGAATAAT TCGTATTGCG 1200

TAACGTTgGA TTCACTTGAT GCAATTCATA AATGGGTGAA CTAATGGTAA TTGTTAAATC 1260

TATCACAGCT GCAATAACAG CAAGAATAAT AGTGAACACC ATAAATTGAA CCATATCAAT 1320

GCCAATATTC ATTGAATACA CATATGTTTC ATCTTGTTGT TCGGTTGaAA AGCCTTGTAG 1380

ATGACCGAAG TAGACCGATA AATAAATGAG TGTAATCAAC AATATTGTTG TAACGATAgT 1440

GCtGgATAAA TGCaGCTTGT GTTTTAACAT TGTAACTATT GAGTACGAAT AAATTACAAG 1500

CGCCAATAAT AATGCAGAAA AAGAATGTGA CGACATAAAT CGGTACGCCA AAAATAATCA 1560

ATACAATACT AATAATTAAA ATAGCGAAAT TTAAAAATAG GGTTAAATAA GAGATGAATC 1620

CCTTTTTACC TCCGAAAATT ATCATCAGAA AGAGGAGCAA TAACGCCAAT ATAAATACAG 1680

CATTCATTGT TTCGCCCTCC TTAATGTTTC AAATATTTCC ATAAACAATA TTGTGATAGG 1740
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CATCGAAATA GTATAAGTCA CTGTATTGGC ATTTTTTAAA AAGATTAAAA ACATAGGTAG 1860

TGCACCGGAT AAATATGAGA ATAATAAGAT GTTAGTCATT GTTCCCATAA TATCTTGGCC 1920

GATGTTTCGC CCAGCAAGCG CCCATCTCCT CATTGAAATG TGTGGCGTAC GCTGTAAAAT 1980

TTCATGCATA CCACTAGCAA TTGTAATTGC AACATCCATA ATAGCGCCAA GTGAACCTAT 2040

TAACACTGAG GCTAGGAAGA TATCTTTCGG TGGTAATGAT AAAAAGTTCA TCGTTTCATA 2100

TTTAATGCCT TTACCATCTG TCATATATAT GATTAATTCT GTTAAACCTA TACTCAAAAA 2160

AGTTCCGATA ATTGTACTGG CTATGGTAAT GAGTGTACGC ATATGCCAGC CTGTAACGAG 2220

CAATAAAGTG AGTATTGTTG AACAGATCAT GGCAATGGTC ATGAGTAAGA ATAAATTAAT 2280

ATTGCTATGT -pGAATATGAA TGTAAATTGC GATTAATATG GCAATAGAAT TCAAGATTAA 2340

CGATAAAATC GATTGCAGTC CGACTTTGCG ACCAACCAAT AATACAGTTA ATAAGAACAA 2400

ACCAGTGATG ATAACCGTTA AGGTATCACG CTTCTTTTCT ATAATATAAG CATCACTCGG 2460

CTTGTTAGAA ATATGTAATA ATACTTTTTC GTGTGTGCGA AATGCCTCAG AATCTGCTTG 2520

CGATTTGACG TACTGATGAT TAATCGTCGT CGTTTCTCCA GCAAATTGAC CATTTAATAT 2580

TTTGACTTTT AATTGATTTT TATATTTAAT ATCACGATTA TTTTGTGCAT CTTTTGTAGG 2640

TGTCGAAGAA ACATGTTTGA CATCTATAAT TTGACCAATT GGTTTGTTGT AAAAGTTCTC 2700

ATTATTGAAT GTAAATAAAA TAGCACCAAT GAATGCGATG CAGAACAAAC CTAAAATTAT 2760

ATTAAATGGC TTTGTAAATA AATTTCTATA TTTCAAAAAC AAAACCCCAA TTCTATGAAT 2820

GAATTAATAT GGTGATTATA CGCCCTTAAT TTTTTATTTT CAAAGATATT ACTGCTAAGT 2880

GTAAAACGAA AATCATCATT GATAGCATCG AATTACTTAA TGGAATGTAG ACGTTTTAGT 2940

CATTAATTGC TGAATAAGTG TTAATAATAT GCCAATATCA CTCTTTGTAT AAGGCTCCTT 3000

TGTjyVTAGCA CATATCGTTC TTTTTAATTC AGTATGATCT AATTTTATAT CTATCCATGA 3060

TTTAGATTCT GGTAAATGTA TATTTTGTGA TGAAATGATG TAACCTTCTT TTTGACGAAG 3120

GAGATAcTGC GCAAGTGGTT GGCTACTGAT TGTGTATACA TCTGATTTAG TAATCTTGCG 3180

CAATTGTTTT TTTACAGTTT CGGCAAATGG TGCCAAGCAA TAAATATGAC TATGCTCAAA 3240

CTGAATTAAT GGTGGGTGTG TCGCCATCGT AATTGGATCG TCTGAAGGCG CATATAAATG 3300

ATAGTGCTCT TCGAATAAAG GTAGCATATG TAATTGTTTG TGTTTACGTA TTTCTGGTGT 3360

AAGTTCCGTG AAACCAATGT CTATATTCCC ATTTAATACG CTATTTATAA TTGTGTCATG 3420

TTCTAATAAG CTCGGTATGA CATGTGTATC ATTTTGTAAA TGAAACGTTT GGATAAGTGG 3480

TAGTAACATG TGGGATACGT CACTCTCATC ATAGCCAATG TAGATACTTT TATTTTTAGT 3540

TTCATTAAAT AATAATTTCC CTTCAGATGT GAGCGTAATA TTGCGTCCTT GCTTTTTAAA 3 6 60 

TAAAGACACA TTAAGTTCTT GTTCTAATAA TGTAATTTGA CGGCTTATCG CTGATTGAGC 3 720 

AATGTTTAGT TCAAGTGCTG TTTCGGAGAT ATGTTCTCTT TTAGCGACCT CGATAAAATA 3780

TCTTAATTGT TTAATTTCCA TAGCGATATA GGCACCTCCA AAAATGAGTG TTTTGTAACT 3 84 0 

ATTATAGCAA TATTATTGAT AAATGTTCTA TTTTTTAGAT GAATATCTTC TATTTTATAT 3 9 00 

ATTGAACAGA TAAATTTTTT AGATTATAGT AATTATCATT AATAACTAAT ATCAGAATAT 3 960 

TCTAAAAAAG GGGTGTGCAT CATGCACAAT GAGAAATTAA TTAAAGGCTT ATATGACTAT 4020 

CGTGAGGAAC ATGATGCGTG TGGTATTGGT TTTTATGCGA ATATGGATAA TAAAAGGTCT 4 08 0 

CACGACATCA TTGATAAATC GCTTGAAATG TTGCGACGCT TAGATCACAG GGGCGGGGTC 4140

GGCGCAGATG GCATCACTGG TGATGGCGCA GGTATTATGA CTGAAATACC TTTTGCATTT 42 00 

TTCAAACAAC ATGTAACGGA CTTTGATATC CCAGGTGAAG GTGAATATGC CGTGGGGTTA 4260

TTTTTTTCCA AAGAACGCAT TTTAGGTTCT GAACATGAAG TAGTTTTTAA AAAATATTTT 4 3 20 

GAAGGCGAAG GGTTATCAAT TCTTGGTTAT CGTAATGTAC CAGTTAATAA AGATGCCATT 43 80 

GCTAAACATG TAGCAGATAC GATGCCAGTC ATTCAACAAG TGTTTATTGA TATTAGGGAC 4440

ATTGAAGATG TTGAAAAGCG TTTGTTTTTA GCGAGAAAAC AATTAGAGTT CTATTCGACT 4 500 

CAGTGCGATT TAGAATTGTA TTTTACGAGC TTATCACGCA AAACAATTGT ATATAAAGGT 4 560 

TGGTTACGAT CAGACCAAAT TAAAAAACTA TATACAGATT TATCGGATGA TTTATATCAA 4 620 

TCAAAGCTAG GGTTAGTGCA TTCGAGATTT AGTACGAATA CATTCCCGAG TTGGAAAAGG 4 6 80 

GCACATCCTA ACCGTATGTT AATGCATAAT GGTGAGATTA ACACGATTAA AGGTAATGTA 4740 

AACTGGATGC GAGCACGCCA ACATAAATTA ATCGAAACAT TATTTGGCGA GGATCAACAT 4 800 

AAAGTGTTTC AAATTGTCGA TGAGGATGGT AGTGACTCTG CCATTGTAGA TAATGCGCTA 4860

GAGTTCTTAT CGTTAGCCAT GGAGCCAGAA AAGGCAGCGA TGTTACTCAT ACCTGAACCT 4 92 0 

TGGTTATATA ATGAAGCGAA TGATGCAAAT GTACGTGCGT TTTATGAATT TTATAGTTAT 4 98 0 

TTAATGGAAC CGTGGGATGG TCCTACAATG ATTTCGTTCT GTAACGGTGA CAAACTTGGC 504 0 

GCGCTTACAG ATAGAAATGG ATTACGTCCA GGTCGTTATA CGATTACTAA AGATAACTTT 5100 

ATTGTCTTTT CATCTGAAGT GGGTGTTGTG GACGTACCTG AAAGTAATGT TGCTTTTAAA 516 0 

GGTCAATTGA ATCCTGGAAA GTTATTGCTT GTTGATTTTA AACAGAATAA AGTCATTGAA 522 0 

AATAATGATT TAAAAGGTGC GATTGCTGGA GAATTACCAT ATAAAGCGTG GATTGATAAC 5280

CATAAAGTTG ACTTTGATTT TGAAAATATA CAATATCAAG ATTCGCAATG GAAAGATGAG 534 0 
CAGGAACTTG TAGAAGGTAA GAAGGATCCT ATCGGTGCAA TGGGATATGA TGCGCCAATT 5460

GCAGTGTTGA ACGAGCGACC AGAATCACTA TTTAATTACT TTAAACAGCT GTTTGCACAA 5520 

GTTACGAATC CACCAATTGA TGCGTATCGT GAAAAAATCG TAACGAGTGA ACTTTCTTAT 55 8 0 

TTAGGTGGCG AAGGTAACTT ACTAGCACCT GACGAAACGG TTTTAGATCG TATTCAATTG 564 0 

AAAAGGCCGG TATTGAATGA ATCACACTTA GCAGCGATTG ATCAGGAACA TTTTAAATTA 5700 

ACTTATTTAT CAACGGTATA TGAAGGGGAT TTGGAAGATG CGTTAGAAGC ATTAGGCCGA 5760 

GAAGCAGTGA ATGCTGTAAA GCAAGGCGCT CAAATTCTAG TGTTAGATGA TAGTGGATTA 582 0 

GTTGATAGCA ATGGCTTTGC AATGCCGATG TTACTCGCAA TAAGTCATGT GCATCAATTA 5880

CTTATTAAAG CAGATTTACG TATGTCTACA AGTTTAGTCG CTAAATCTGG TGAGACACGA 5 94 0 

GAAGTGCATC ATGTTGCTTG TTTACTCGCA TATGGCGCGA ATGCAATTGT GCCATACCTA 600 0 

GCGCAACGTA CAGTTGAACA ACTGACATTG ACAGAAGGGT TACAAGGCAC CGTTGTCGAT 6060

AATGTTAAGA CATATACGGA TGTATTGTCA GAAGGTGTCA TTAAAGTAAT GGCTAAGATG 6120 

GGAATTTCGA CAGTGCAAAG TTATCAAGGG GCACAAATAT TTGAAGCGAT TGGCTTGTCT 618 0 

CATGATGTGA TTGATCGTTA TTTTACTGGG ACACAGTCTA AGTTATCTGG TATTTCGATT 6240

GATCAAATTG ATGCTGAAAA TAAAGCACGT CAACAAAGTG ATGATAATTA TCTTGCATCA 63 00 

GGTAGTACAT TCCAATGGAG ACAACAAGGT CAACATCATG CTTTTAATCC GGAATCTATT 63 6 0 

TTCTTATTGC AGCACGCATG TAAAGAAAAT GACTATGCGC AATTTAAAGC ATACTCTGAA 6420 

GCGGTGAACA AAAATAGAAC AGATCACATT AGACATTTAC TTGAATTTAA AGCATGTACA 64 8 0 

CCGATTGACA TCGACCAAGT TGAACCGGTA AGTGACATTG TCAAACGCTT TAATACAGGG 654 0 

GCGATGAGTT ATGGATCGAT TTCAGCGGAA GCACATGAAA CGTTAGCACA AGCCATGAAC 6600 

CAATTAGGTG GAAAGAGTAA TAGTGGTGAA GGTGGCGAAG ATGCAAAACG TTATGAAGTA 6660 

CAAGTTGATG GAAGCAACAA AGTAAGTGCG ATTAAACAAG TTGCTTCTGG GCGTTTTGGT 6720 

GTAACTAGTG ATTATTTACA ACATGCCAAA GAAATTCAAA TTAAAGTTGC GCAAGGTGCA 678 0 

AAGCCTGGTG AAGGTGGTCA ATTACCTGGT ACTAAGGTAT ATCCGTGGAT TGCGAAGACA 684 0 

AGAGGGTCAA CGCCAGGTAT CGGTCTGATT TCACCACCGC CACATCATGA TATTTATTCA 6900

ATAGAAGATT TAGCGCAACT GATACATGAT TTGAAAAATG CGAATAAAGA TGCAGATATC 6 96 0 

GCGGTAAAAT TAGTTTCGAA AACAGGTGTT GGTACCATTG CATCTGGGGT GGCAAAAGCA 702 0 

TTTGCAGATA AAATTGTCAT CAGTGGTTAC GATGGTGGTA CAGGGGCTTC ACCCAAAACG 7080

AGTATTCAGC ATGCCGGTGT TCCTTGGGAG ATTGGTTTAG CAGAAACACA TCAAACATTA 714 0 
AAAGATGTAG CGTACGCATG TGCGCTTGGA GCGGAAGAAT TTGGATTTGC AACTGCACCA 7260

TTAGTGGTGT TGGGCTGTAT TATGATGCGT GTATGCCATA AAGATACATG TCCAGTAGGA 7320 

GTTGCAACTC AAAACAAAGA TTTACGTGCT TTATATAGAG GTAAAGCACA TCATGTTGTT 7 3 80 

AATTTTATGC ATTTTATTGC ACAAGAATTA AGAGAAATTT TAGCATCTTT AGGTTTGAAA 744 0 

CGTGTAGAAG ACTTAGTTGG AAGAACTGAT TTATTACAAC GATCATCAAC ATTAAAAGCG 750 0 

AATAGCAAAG CGGCTAGTAT TGATGTTGAA AAACTGTTAT GTCCTTTCGA TGGGCCAAAC 7 560 

ACAAAAGAAA TTCAACAAAA TCATAATCTT GAGCATGGAT TTGATTTAAC AAATTTATAT 762 0 

GAAGTAACGA AGCCATATAT TGCTGAAGGG CGTCGCTATA CAGGTAGCTT TACAGTAAAT 7680 

AATGAACAAC GTGATGTAGG GGTTATTACA GGTAGTGAGA TTTCGAAACA ATATGGAGAA 7740 

GCAGGACTTC CTGAAAATAC AATTAATGTT TATACGAATG GTCATGCTGG TCAAAGTCTT 7800 

GCAGCATATG CACCGAAAGG CTTAATGATT CATCATACTG GAGATGCGAA TGACTATGTT 7860

GGTAAAGGAT TATCTGGTGG TACGGTCATT GTCAAAGCAC CTTTTGAAGA ACGACAAAAT 7920 

GAAATTATTG CTGGTAACGT CTCATTCTAT GGTGCGACAA GTGGTAAGGC ATTTATTAAC 798 0 

GGTAGTGCAG GAGAAAGATT CTGTATTAGA AATAGTGGTG TAGATGTTGT CGTTGAAGGT 8040

ATCGGCGACC ATGGATTAGA GTATATGACT GGTGGACATG TCATTAATTT AGGTGATGTA 8100 

GGTAAGAACT TCGGTCAAGG TATGAGTGGT GGTATTGCTT ACGTTATCCC GTCTGATGTA 8160 

GAAGCTTTTG TTGAAAATAA TCAACTAGAT ACGCTTTCGT TTACAAAGAT TAAACACCAA 8220

GAAGAAAAAG CATTCATTAA GCAAATGCTG GAAGAACATG TGTCACACAC GAATAGTACG 8280 

AGAGCGATTC ATGTGTTAAA ACATTTTGAT CGCATTGAAG ATGTCGTCGT TAAAGTTATT 834 0 

CCTAAAGATT ATCAATTAAT GATGCAAAAA ATTCATTTGC ACAAATCATT ACATGACAAT 8400 

GAAGATGAAG CGATGTTAGC TGCATTTTAC GATGACAGTA AAACAATCGA TGCTAAACAT 8460 

AAACCAGCCG TTGTGTATTA AGGAAAGGGG GAGATACGAT GGGTGAATTT AAAGGATTTA 8520 

TGAAGTATGA CAAACAGTAC TTAGGTGAAT TATCACTGGT AGACCGTTTG AAGCATCATA 8580 

AAGCATATCA ACAACGATTT ACTAAAGAAG ATGCCTCTAT CCAAGGTGCA CGATGTATGG 864 0 

ATTGTGGAAC GCCGTTTTGT CAAACCGGAC AACAGTATGG TAGGGAAACA ATAGGTTGTC 8700 

CAATTGGAAA CTACATTCCT GAATGGAACG ACTTAGTGTA TCATCAAGAT TTTAAAACTG 8760 

CTTATGAACG CTTAAGCGAA ACAAATAACT TTCCTGACTT TACAGGGCGT GTATGTCCTG 8820 

CACCATGCGA AAGTGCTTGT GTGATGAAGA TTAATAGAGA ATCGATTGCG ATTAAAGGTA 8880

TTGAACGCAC AATTATTGAT GAAGCTTTTG AAAATGGTTG GGTAGCGCCG AAAGTTCCGA 8940 
CTGAAGAACT TAATCTACTA GGATATCAAG TAACTATTTA TGAACGTGCT AGAGAATCAG 9060

GCGGTTTATT AATGTATGGT ATTCCGAATA TGAAACTTGA TAAAGATGTG GTTCGACGTC 9120

GTATTAAGTT AATGGAAGAA GCGGGCATTA CTTTCATTAA TGGTGTTGAA GTCGGTGTTG 9180

ATATTGATAA AGCAACGTTA GAATCTGAGT ATGATGCCAT TATATTATGT ACTGGTGCAC 924 0 

AAAAAGGTAG AGATTTACCT TTAGAAGGAC GCATGGGTGA TGGTATACAT TTCGCTATGG 9300 

ATTATTTAAC TGAACAAACG CAGTTGTTAA ATGGAGAAAT TGATGATATA ACAATAACTG 9360 

CAAAAGATAA GAATGTCATT ATCATTGGTG CTGGTGATAC AGGGGCAGAC TGTGTAGCGA 94 20 

CAGCATTAAG AGAAAATTGT AAATCGATTG TTCAATTTAA TAAATATACG AAATTGCCAG 94 80 

AAGCAATTAC ATTTACAGAA AATGCATCAT GGCCTTTAGC AATGCCGGTG TTTAAAATGG 9540

ACTATGCGCA CCAAGAGTAC GAAGCTAAGT TTGGTAAGGA ACCACGTGCA TATGGTGTTC 9600 

AAACAATGCG TTACGATGTT GACGATAAAG GACACATACG TGGTTTGTAT ACTCAAATTT 966 0 

TAGAGCAAGG CGAAAATGGT ATGGTCATGA AAGAAGGACC TGAAAGATTT TGGCCTGCTO 9720 

ACCTTGTATT ATTATCAATC GGCTTCGAAG GTACAGAACC AACAGTACCG AATGCTTTTA 97 80 

ACATTAAAAC GGATAGAAAT CGAATCGTGG CGGATGATAC AAACTATCAA ACTAATAATG 9840

AAAAGGTATT TGCTGCTGGA GATGCTAGAC GTGGTCAAAG TTTAGTTGTA TGGGCAATTA 9900 

AAGAAGGTAG AGGCGTAGCG AAAGCAGTAG ATCAGTATTT AGCTAGTAAA GTTTGTGTAT 9 960 

AATCTTTGTA TGGAAATGGT GGTTACGTTG ACGTTGTGAC ATGCTGAATC GAGTTTGAAA 10020

AAATCTAGTA TCTATCAACG TCACATGCCA TCTTTGTAAC CTAAAAACAA AGGTTTGTAA 10080 

GACAACAAAT AGATTAATTA TAAGTAGTGA TTTTTTACAT TCGTTTATAG GTCAACTGTA 10140 

GTGGAAGACA ATGATTTGTG GTAATCATGT AATGCTTAAA AACAATATTG ACTTTTACAG 10200

AACGTTCATA TATGATAAAT ATTGTGTTTA GGAGGAATAC CCAAGTCCGG CTGAAGGGAT 10260 

CGGTCTTGAA AACCGACAGG GGCTTAACGG CTCGCGGGGG TTCGAATCCC TCTTCCTCCG 10320 

CCATCAATAT TTATATTAAA TTCTATATAT AATGAAGGTA AGTGCTCAAA TTTTGAGTAT 10380

TTACCTTTTT TATTTGTCTT TGAATGGCTC GTAATTTTTG ATAATAGAAA TGATAAGGCA 1044 0 

TTGAGATTGG AAGGGCATTT GGCTTGTGCA ATATACATAG CTAAATGTCT TTTTTGTTTT 10 500 

GTGAAATATG ATGGATGGCT TGTGTGGACA AGTTTGCTAT TTATAGATAT GCATTTTTCA 10560

ATTTAGGAGT TGGCCATGCA TCTACACTTT ATAATGGTGA GAGCGTGGTG AGGTATTGTT 10 620 

AATAACGCAA TTGTAGCGAG GAGTTATTGC TACATATGTC GTTATGGCTC ATTGATTTTC 106 80 

TGAAATGGCT ACCCCAGATA ATTGTGACAA AATAAAAATA TTTTGTTGAA AGCCTTTACA 10 740 
TAAAAAGAGA AGATGTAAAA GCCATCGTAA CCGCTATTGG GGGAAAAGAA AATCTTGAAG 10860

CTGCAACGCA TTGTGTAACA CGATTACGTT TAGTGCTGAA GGATGAAAGT AAAGTTGATA 10 920 

AAGACGCATT AAGTAATAAC GCGTTGGTCA AGGGGCAGTT TAAAGCAGAC CATCAATATC 10980

AAATTGTCAT TGGTCCAGGA ACAGTCGATG AAGTGTATAA GCAGTTTATT GATGAAACAG 11040 

GTGCTCAAGA AGCTTCGAAA GATGAAGCGA AACAAGCAGC TGCACAAAAA GGGAATCCAG 11100 

TACAACGTTT GATCAAATTG TtGGGGGATA TTTTTATACC AATATTACCT GCGATTGTGA 11160

CAGCTGGTTT GTTAATGGGA ATCAATAATT TACTTACAAT GAAAGGTTTA TTTGGTCCAA 11220 

AAGCACTTAT TGAGATGTAT CCACAAATTG CTGATATTTC AAACATCATT AATGTGATTG 11280 

CGAGTACGGC ATTTATTTTC TTACCAGCAT TAATTGGTTG GAGTAGTATG CGTGTATTTG 1134 0 

GTGGTAGTCC GATTCTAGGC ATAGTCTTAG GTTTGATTTT AATGCATCCG CAATTAGTAT 114 0 0 

CTCAGTATGA TTTGGCAAAA GGGAATATTC CGACGTGGAA CTTATTTGGC TTAGAGATTA 114 60 

AGCAGTTGAA TTACCAAGGT CAAGTGTTGC CAGTtTTAAT TGCAGCTTAC GTTCTAGCTA 1152 0 

AAATTGAAAA AGGATTAAAT AAAGTCGTTC ACGATTCGAT AAAAATGTTG GTCGTTGGAC 1158 0 

CCGTAGCGCT TTTAGTTACT GGATTTTTAG CATTTATTAT CATTGGACCA GTTGCGTTAT 11640

TGaTTGGTAC AGGTATTACA TCTGGTGTTA CATTTATATT CCAACATGCA GGATGGCTTG 1170 0 

GCGGAGCAAT ATATGGATTG TTATATGCAC CACTTGTAAT TACAGGACTA CACCATATGT 11760 

TTTTAGCAGT AGATTTCCAA TTGATGGGTA GCAGCTTAGG CGGTACGTAT TTATGGCCAA 11820

TTGTTGCGAT TTCCAATATT TGTCAGGGCT CTGCAGCATT TGGAGCATGG TTTGTCTATA 118 80 

AACGTCGTAA AATGGTTAAA GAAGAAGGCT TGGCATTAAC ATCTTGTATT TCTGGTATGT 11940 

TAGGTGTTAC TGAACCAGCC ATGTTCGGTG TGAACTTACC TCTGAAATAT CCATTTATCG 12000

CTGCGATATC AACGTCTTGT GTATTGGGGG CAATCGTTGG TATGAATAAC GTACTTGGAA 12 060 

AAGTTGGTGT TGGTGGCGTG CCAGCATTCA TTTCAATTCA AAAAGAATTT TGGCCAGTAT 1212 0 

ATCTTATTGT GACAGCTATT GCTATTGTTG TACCATGTAT ACTAACAATT GTGATGTCTC 12180

ATTTTAGTAA ACAAAAAGCG AAAGAAATTG TTGAAGATTA ATAAAATAAA AAAGGGGCGT 12240 

TCGTTATTTG GACGTCCTTT ATTACGTTAT AAGGTGGTAA TTGTGTGTCG AAAGAAATAG 123 00 

ATTGGAGAAA ATCCGTTGTA TATCAAATTT ATCCTAAGTC GTTTAATGAT ACGACGGGGA 123 60 

ATGGTATAGG AGATATCAAT GGAATTATAG AAAAATTGGA TTATATCAAG TTATTGGGTG 12420 

TTGATTATAT TTGGTTAACA CCAGTGTATG AATCACCGAT GAATGATAAT GGCTATGATA 124 80 

TCAGCAATTA TTTAGAAATC aATGAAGACT TTGGAACGAT GGATGATTTT GaAAAGTTAA 12 540 
CGACGGAGCA TGaATGGTTT AAAGAAGCCC
ATTACTTTTT CAGATCATCT GAAGACGGGC 

GTAATGCATG GAAGTATGAT TCTGAGACAG

GTCAAGCTGA TTTAAATTGG GATAATCCGG 
ATCATTGGAT AGACTTCGGC GTTGATGGTT 

AAGGTGAATT TAAGGACTCT GACAAAATAG

TGCATGAGTT TCTGCATGAA TTAAATCGTC 
TAGGAGAAAT GTCTTCGACG ACGATTGAAA 

AAGAATTGAA TAGTGTTTTT AATTTTCATC 
AGTGGACAAA TGCGAgcTTG nATTTTCATA 
GAGGTATTTA TGACGGTGGC GGATGGAACG 

GGGTAGTGTC TAGATTTGGT GATGATACGT 
TGTTAGCTAT CGCACTGCAT ATGTTGCAAG 
TTGGTATGAC GGACCCACAT TTTACATCAA 

ATGCCTACCA TCAGTTGTTA AGTGAAGGGC 
GACAGAAGTC ACGAGACAAT TCGAGAACGC 

GATTTACAGC TGGTAAnCCT TGGATTGATA

GACAAGCACT TCAGAATAAA GAGTCTATTT 
GACATACGCA TGATATTATT ACGTATGGAG 

ATTTATTTGT TTATGAACGT CATTATAAGA

CAGC&TCGGC TGTTGATTTG CCAGAAGGAT 
CAGGCACAGT GGAAAATAAT ACGATAAGCG 

CGTAAAATAA ATTGAGTGGA TGCGTTTATA

ATGAGGCGTT GAAAGAAGAT ATATTAAACG 
CTGAACATGA TTTGGTGCAA TTGTACCAGT 
ATTTGTTGGC ATTAGACGGC ATGATTCAAA 
ATCAGGAGGT TACAGAGTTT CCATTTTCTG 
AAATGGGCGT CGCATATTTA ACTGAAGTTG 

TTCCAGAAGT TCAACATGCT IT AAA CAT CA 



GTAAATCTAA AGATAACCCy TATAGAGATT 1266 0 

CGCCAACAAA TTGGCATTCT AAATTCGGTG 12720 

ATGAATATTA TTTACATTTA TTTGATGTCA 1278 0 

AAGTACGTCA ATCGTTATAT CGCATAGTCA 12840 

TTCGATTTGA TGTCATTAAC TTAATTTCTA 12 900 

GTAAAGAATT TTATACGGAT GGTCCTAGAG 12960 

AAACGTTTGG TAACACTGAC ATGATGACTA 13 020 

ATTGTATTAA GTATACACAA CCAGAACGCC 13 080 

ATCTAAAGGT TGATTATGTT GATGGTGAAA 1314 0 

AGTTAAAGGA AATTCTGATG CAATGGCAAC 13200 

CGATTTTCTG GTGTAATCAT GATCAGCCAC 13 26 0 

CGGAAGAGAT GAGGATACAA AGTGC7AAAA 13 32 0 

GGACGCCATA TATTTACCAA GGTGAAGAAA 13 3 80 

TAGCACAATA TCGTGATGTT GAATCGATTA 13440 

ATGCTGAAGC GGATGTGTTA GCGATTTTAG 13 500 

CTATGCAATG GAGTGATGAT GTTAATGCTG 13 560 

TTTCGGAAAA TTATCATCAG GTCAACGTTA 13 620 

i<winiAUvjin i. v^nnnnA l nmv^mi inn x_xj<jv 

ACATTGTGCC ACGTTTTATG GATCATGATC 13740 

ATCAACAATG GCTAGTAATT GCGAATTTCT 13 800 

TGGCTAGAGA AGGTTGTGTT GTGATTCAAA 13 860 

GGTTTGGTGC AATTGTAATC GAAACAAACG 13 920 

TGGCGAAACA AAAAAAGTTT ATGAAGATTT 13 980 

GGCAGATTCA ATATGGTGAA CAAATTCCGT 14 04 0 

CATCTCGAGA GACCGTGCGT AAGGCATTAG 14100 

AGATTCATGG TAAAGGGTCA CTTGTCATTT 1416 0 

AACTTGTTAG TTTTAAAGAA ATGCAAGAAG 14 220 

TTGTGAATGA GGTTGTTGAA GCGCATGAAG 142 8 0 

ATTCTAGTGA ATCACTCATT CATATTGTTA 14 340 



TTGTTTCAGA TATAGGTAAT GATGTTGCGA 
TATTAAATCT TAATATTAGT TATTCAAGTA 

AAGCATATCA ATTGTTTGGT GATGTATCGG

TGTATTTAGA AAATACAATG CCGTTTCAAT 
TTAAATTTAA TGACTTCTCA AGACGTCGTA 

CTTGCAATTA ACTATTAAAA TATAGTAATA

CGGTTCCCTG TACTCGAAAT CCGCTTTATG 
TTTTGCGAAG TCTGCCCAAA GCACGTAGTG 

CCCATGAACC ATGTCAGGTC CTGACGGAAG 
AGGgTAGCCG AGATTTAGCT AACGACTTTG 
GGTGCACGGT TTTTTATTTT TTAAATATTA 

TTATAGAAGC TACTTTCTTG AAGACAATTC 
AAGTAGCTTT TTTATATGTG AAGTTTGATT 
TTTTGTGTCA ATGAAAAGTA AGAAGTTATA 

AGGGGGAGTA TCTTACAATA GAATTATTAA 
TGCCTACGGA GGACATATGC AAATATATTT 

ATCTTTAAAT AGTATTGAAG AAAGTTTTGA
TGCGAAAGTA AAACATTTAA GAAAATCTCC 
GAAAAATGAA AATAACGATG TCGTTGGACA 

TGATGATAAG ACGTATTATG GTTTGGCGAT
TGGACAAAAA TTAGGTCGTG GCTTGGTTCA 
GTATAGTACG GTTGTTGTAG ACCATTGTTT 

TGCTGCTGAG CATGACATTA AATTAGAATC
ATGGGATAAT TTGACGGATG CACCACACGG 
ATTGTTCAAT TAAGAAGTAA AGGTATTATC 

GGTGCTAACT TGAATTATCA AGCCTTATAT 
GTCGTCGGAC AAGAACATGT CACGAAGACA 
TCGCATGCTT ATATTTTTAG TGGTCCGAGA 

TTTGcTAAAG CAATCAACTG TCTAAATAGC 
GTGATTCTAT TTATGATTAT TTGGAAAAGG 14460

AGTCTATTAC TTTTGAACCG TTTGATGAAC 14 5 20 

TGGCTTATTC AGCAACAGTT CGAAGTATTG 14 58 0 

ATAATATTTC AAAACATCTT GCAAATGAAT 14 640 

TAAAGTAAAC AATGATATAA ATGATTTATA 14 700 

TATATCTTGC CGTGCTAGGT GGGGAGGTAG 14 760 

CGAGGCTTAA TTCCTTTGTT GAGGCCGTAT 14 820 

TTTGAAGATT TCGGTCCTAT GCAATATGAA 14 880 

CAGCATTAAG TGGATCATCA TATGTGCCGT 14 94 0 

GTTACGTTCG TGAATTACGT TCGATGCTTA 15000 

AACCGATTAT TAAGAGTTGA AAATATATAA 150 6 0 

AGCGTATTAT ACGTGGAACA TGTTTGTGGG 15120 

CAAGTGAACT CGATGTGCAG TTTGAATGAT 15180 

ATTTGATGAT AAAGAAATGA TGGTGAAATG 15240 

TGAGATACGT TATGATTATT GACAATCAAA 153 00 

AAGTACTTTA ACAGAGTTAG ATTATGATAA 153 60 

TGATAATCCT GAAACGAGTT GGCAAGCACG 15420 

TTGCTATAAT TTTGAATTAG AAGTAATAGC 15480 

CGTTTTATTA ATTGAAGTAG AAATTAATAG 15540 

TGCCTCTTTA TCAGTTCATC CTGAATTACG 15600 

AGCAGTAGAA GAGCGTGCCA AAGCACAAGA 15660 

TGACTACTTT GAAAAGTTGG GTTATCAAAA 15720 

TGGTGATGCA CCGTTACTTG TAAAATATTT 15780 

AATCGTAAAA TTTCCAGAAC ATTTTTATTA 15840 

ATGCTATAAT GAGAGGTAAT TGTTTATGGA 15900 

CGTATGTACA GACCCCAAAG TTTCGAGGAT 15960 

TTGCGCAATG CGATTTCGAA AGAAAAACAG 16020 

GGTACGGGGA AAACGAGTAT TGCCAAAGTG 16080 

ACTGATGGAG AACCTTGTAA TGAATGTCAT 16140 

AATAATGGCG TTGATGAAAT AAGAAATATT AGAGACAAAG TTAAATATGC ACCAAGTGAA 16 260 

TCGAAATATA AAGTTTATAT TATAGATGAG GTGCACATGC TAACAACAGG TGCTTTTAAT 16320 

GCCCTTTTAA AGACGTTAGA AGAACCTCCA GCACACGCTA TTTTTATATT GGCAACGACA 16380 

GAACCACATA AAATCCCTCC AACAATCATT TCTAGGGCAC AACGTTTTGA TTTTAAAGCA 1644 0 

ATTAGCCTAG ATCAAATTGT TGAACGTTTA AAATTTGTAG CAGATGCACA ACAAATTGAA 16 500 

TGTGAAGATG AAGCCTTGGC ATTTAtcgCT AAAGCGTCTG AAGGGGGTAT GCGTGATGCA 16 560 

TTAAGTATTA TGGATCAGGC TATTGCATTT GGTGATGGTA CGTTAACATT GCAAGATGCG 16620 

TTGAATGTCA CAGGTAGCGT ACATGATGAA GCGTTGGATC ACTTGTTTGA TGATATTGTA 16680 

CAAGGTGACG TACAAGCATC TTTTAAAAAA TACCATCAGT TTATAACAGA AGGTAAAGAA 16740 

GTGAATCGCC TAATAAATGa TATGATTTAT TTTGTCaGAG ATACGATTAT GAATAAAACA 168 0 0 

TCTGAGAAAG ATACTGAGTA TCGAGCACTG ATGAACTTAG AATTAGATAT GTTATATCAA 16 86 0 

ATGATTGATC TTATTAATGA TACATTAGTG TCGATTCGTT TTAG7GTGAA 7CAAAACGTT 16920 

CATTTTGAAG TGTTGTTAGT AAAATTAGCT GAGCAGATTA AGGGTCAACC ACAAGTGATT 16 98 0 

GCGAATGTAG CTGAACCAGC ACAAATTGCT TCATCGCCAA ACACAGATGT ATTGTTGCAA 1704 0 

CGTATGGAAC AGTTAGAGCA AGAACTAAAA ACACTAAAAG CACAAGGAGT GAGTGTCGCT 17100 

CCTGTTCAAA AATCTTCGAA AAAGCCTGCG AGAGGCATAC AAAAATCTAA AAATGCATTT 17160 

TCAATGCAAC AAATTGCAAA AGTGCTAGAT AAAGCGAATA AGGCAGATAT CAAATTGTTG 17220 

AAAGATCATT GGCAAGAAGT GATTGATCAT GCCAAAAATA ATGATAAAAA ATCACTCGTT 17280 

AGTTTATTGC AAAATTCGGA ACCTGTGGCG GCAAGTGAAG ATCACGTACT TGTGAAATTT 1734 0 

GAGGAAGAGA TCCATTGTGA AATCGTCAAT AAAGACGACG AGAAACGTAG TAGTATAGAA 17400

AGTGTTGTAT GTAATATCGT TAATAAAAAC GTTAAAGTTG TTGGTGTACC ATCAGATCAA 17460 

TGGCAAAGAG TTCGAACGGA ATATTTACAA AATCGTAAAA ACGAAGGCGA TGATATGCCA 17 52 0 

AAGCAACAAG CACAACAAAC AGATATTGCT CAAAAAGCAA AAGATCTTTT CGGTGAAGAA 17580

ACTGTACATG TGATAGATGA AGAGTGATAC ATGACAAGCG ATATAATCGT ATGTATAATG 17640 

AAAGAAACAT CATTTTATTG ATAAATATTT ATTGATTTTC AAGGAGGAAA TGGAATATGC 17700 

GCGGTGGCGG AAACATGCAA CAAATGATGA AACAAATGCA AAAAATGCAA AAGAAAATGG 17760 

CTCAAGAACA AGAAAAACTT AAAGAAGAGC GTATTGTAGG AACAGCTGGC GGTGGCATGG 17820 

TTGCAGTTAC TGTAACTGGT CATAAAGAAG TTGTCGACGT TGAAATCAAA GAAGAAGCTG 17880 

TAGACCCAGA CGATATTGAA ATGCTACAAG ACTTAGTGTT AGCAGCTACT AATGAAGCGA 17940 



TCCCTGGaAT GTGATCATAG ATGCATTATC 
TTATGAAATT GCCAGGCATT GGTCCAAAGA 

ATATGAAAGA AGACGATGTT GTTCAGTTTG

TAACATATTG TAGCGTATGT GGTCACATTA 
ATAAGCAAAG AGATCGTTCA GTTATTTGTG 

TGGAAAAAAT GAGAGAATAC AAAGGTTTAT

TGGATGGCAT TGGACCAGAA GATATTAATA 
ATGAAGTTAG CGAATTAATC TTAGCTATGA 

TGTATATTTC TAGATTAGTT AAGCCTATAG 
TATCGGTAGG TGGCGATTTA GAGTATGCTG 
GTAGAACAGA AATGTAATkT CTTCTATTAA 

AAGTCACAGT GTAATCATTG TGGCTTTTTT 
GCGGTGTGGC GGTGGTATGG TTTACCTAGT 
CAAGCCGTTG GTTGTGATTT GTTACTTCTA 

TAGATCTATG GTTATGGTGT GTTGGTGCTA 
CAAATGAAAT TCTTTTGTAA TTGAAATGAT 

GGTCTAAAGC TTATTAAATC AGCCTGTATA

TAAATTTATT TTTAATTTCT GGTAAAAAAA 
ATATGGTTAG AGAAAAATCT GTTTCTTGTT 

TTTTTAAGTT CGATTTTTAG GATAAGGGCG
ACTGfTGTTA AGCAGTTTGA AAGCCTGTAT 
CTCAACTTAA GAAATAACTT GAATTACTAA 

AAATGTTAAT AAAATGTATA ATTAATTCTT
AATGACAATA TGTCAACGTT AATTCCAAAA 
GTATTTATGA GCTAATCAAA CATCATAATT 

GAACGCTGGC GGCGTGCCTA ATACATGCAA
CTGATGTTAG CGGCGGACGG GTGAGTAACA
ACTTCGGGAA ACCGkAGCTA ATACCGGATA

AGACGGTCTT GCTGTCACTT ATAGATGGAT
CAGAACCTAT ATCAAAACTT ATTGATAGCT 18060

CAGCCCAACG TCTGGCTTTT CATACCTTAG 18120

CCAAAGCATT AGTAGATGTT AAGAGAGAAT 18180

CTGAAAATGA TCCATGTTAT ATTTGTGAAG 18240 

TTGTGGAAGA TGACAAAGAT GTCATAGCTA 18300

ATCACGTTTT ACATGGGTCT ATTTCGCCTA 18360 

TTCCTTCATT GATTGAACGC TTGAAAAACG 18420 

ACCCGAACTT AGAGGGGGAA TCTACAGCCA 18480 

GTATCAAAGT GACGAGATTA GCACAAGGGT 18540 

ACGAAGTAAC ATTATCTAAA GCAATCGCAG 18600 

ACATTTTTGA TTTTAATACT ATAGTAAGAA 18660 

TATGGTGTGG TGTGATGTAC TACTTTATTT 18720 

TTTACTGAGG GATGGGTAAT CTTTAGGAAG 18780 

ATAGTAATGA TGTGAATTGG ATTATCGAAT 18 840 

TTAATTTGAT AAATGCGGTT AATGACTATG 18 900 

AGATGCTGGC TTAGTAAGTT GTACTTCTTT 18 960 

GCGGTGTTTT GAGAGATTAT TTAAAACTTG 1902 0 

TAACGTTCTG TTTTGCGTTT TTTTTGATTG 19080 

CTAAAAAACG TACTATTTAT AAGTGGGGAT 1914 0 

TTCAGTACAG ATGACAAAGG TGTAATTTTT 19200 

AGTATTTATT TGTTGAGGCA AACAAAACAA 19260 

CGAAAATTAA TTTTAAAAAG TTATTGACTT 19320 

GTCGGTAAGA AAAATGAACA TTGAAAACTG 19380 

AACGTAACTA TAAGTTACAA ACATTATTTA 19440 

TTTATGGAGA GTTTGATCCT GGCTCAGGAT 19500 

GTCGAGCGAA CGGACGAGAA GCTTGCTTCT 19560 

CGTGGATAAC CTACCTATAA GACTGGGATA 1962 0 

ATATTTTGAA CCGCATGGTT CAAAAGTGAA 19680 

CCGCGCTGCA TTAGCTAGTT GGTAAGGTAA 19740 
GAGACACGGT CCAGACTCCT ACGGGAGGCA GCAGTAGGGA ATCTTCCGCA ATGGGCGAAA 19860

gCtGaCGGAG CAACGCCGCG TGAGTGATGA AGGTCTTCGG ATCGTAAAAC TCTGTTATTA 19 920 

GGGAAGAACA TATGTGTAAG TAACTGTGCA CATCTTGACG GTACCTAATC AGAAAGCCAC 19980

GGCTAACTAC GTGCCAGCAG CCGCGGTAAT ACGTAGGTGG CAAGCGTTAT CCGGAATTAT 2 0 040 

TGGGCGTAAA GCGCGCGTAG GCGGTTTTTT AAGTCTGATG TGAAAGCCCA CGGCTCAACC 2 0 100 

GTGGAGGGTC ATTGGAAACT GGAAAACTTG AGTGCAGAAG AGGAAAGTGG AATTCCATGT 20160

GTAGCGGTGA AATGCGCAGA GATATGGAGG AACACCAGTG GCGAAGGCGA CTTTCTGGTC 20220 

TGTAACTGAC GCTGATGTGC GAAAgCGTGG GGATCAAACA GGATTAGATA CCCTGGTAGT 20280 

CCACGCCGTA AACGATGAGT GCTAAGTGTT AGGGGGTTTC CGCCCCTTAG TGCTGCAGCT 2034 0 

AACGCATTAA GCACTCCGCC TGGGGAGTAC GACCGCAAGt TGAAACTCAA AGGAATTGAC 2 0400 

GGGGACCCGC ACAAGCGGTG GAGCATGTGG TTTAATTCGA AGCAACGCGA AGAACCTTAC 2 0460 

CAAATCTTGA CATCCTTTGA CAACTCTAGA GATAGAflCPT TrCCCTTCGG GGGACAAAGT 20520 

GACAGGTGGT GCATGGTTGT CGTCAGCTCG TGTCGTGAGA TGTTGGGTTA AGTCCCGCAA 2 0580 

CGAGCGCAAC CCTTAAGCTT AGTTGCCATC ATTAAGTTGG GCACTCTAAG TTGACTGCCG 2 0640 

GTGACAAACC GGAGGAAGGT GGGGATGACG TCAAATCATC ATGCCCCTTA TGATTTGGGC 20700 

TACACACGTG CTACAATGGA CAATACAAAG GGCAGCGAAA CCGCGAGGTC AAGCAAATCC 2 0760 

CATAAAGTTG TTCTCAGTTC GGATTGTAGT CTGCAACTCG ACTACATGAA GCTGGAATCG 20820 

CTAGTAATCG TAGATCAGCA TGCTACGGTG AATACGTTCC CGGGTCTTGT ACACACCGCC 20880 

CGTCACACCA CGAGAGTTTG TAACACCCGA AGCCGGTGGA GTAACCTTTT AGGAGCTAGC 2 0 940 

CGTCGAAGGT GGGACAAATG ATTGGGGTGA AGTCGTAACA AGGTAGCCGT ATCGGAAGGT 21000

GCGQCTGGAT CACCTCCTTT CTAAGGATAT ATTCGGAACA TCTTCTTCAG AAGATGCGGA 21060 

ATAACGTGAC ATATTGTATT CAGTTTTGAA TGTTTATTTA ACATTCAAAT ATTTTTTGGT 21120 

TAAAGTGATA TTGCTTATGA AAATAAAGCA GTATGCGAGC GCTTGACTAA AAAGAAATTG 21180

TACATTGAAA ACTAGATAAG TAAGTAAAAT ATAGATTTTA CCAAGCAAAA CCGAGTGAAT 21240 

AAAGAGTTTT AAATAAGCTT GAATTCATAA GAAATAATCG CTAGTGTTCG AAAGAACACT 2130 0 

CACAAGATTA ATAACGCGTT TAAATCTTTT TATAAAAGAA CGTAACTTCA TGTTAACGTT 21360

TGACTTATAA AAATGGTGGA AACATAGATT AAGTTATTAA GGGCGCACGG TGGATGCCTT 2142 0 

GGCACTAGAA GCCGATGAAG GACGTTACTA ACGACGATAT GCTTTGGGGA GCTGTAAGTA 214 80 

AGCTTTGATC CAGAGATTTC CGAATGGGGA AACCCAGCAT GAGTTATGTC ATGTTATCGA 21540 
GAGGAAGAGA AAGAAAATTC GATTCCCTTA GTAGCGGCGA GCGAAACGGG AAGAGCCCAA 21660

ACCAACAAGC TTGCTTGTTG GGGTTGTAGG ACACTCTATA CGGAGTTACA AAGGACGACA 21720

TTAGACGAAT CATCTGGAAA GATGAATCAA AGAAGGTAAT AATCCTGTAG TCGAAAATGT 21780

TGTCTCTCTT GAGTGGATCC TGAGTACGAC GGAGCACGTG AAATTCCGTC GGAATCTGGG 21840

AGGACCATCT CCTAAGGCTA AATACTCTCT AGTGACCGAT AGTGAACCAG TACCGTGAGG 21900

GAAAGGTGAA AAGCACCCCG GAAGGGGAGT GAAATAGAAC CTGAAACCGT GTGCTTACAA 21960

GTAGTCAGAG CCCGTTAATG GGTGATGGCG TGCCTTTTGT AGAATGAACC GGCGAGTTAC 22020

GATTTGATGC AAGGTTAAGC AGTAAATGTG GAGCCGTAGC GAAAGCGAGT CTGAATAGGG 22080

CGTTTAGTAT TTGGTCGTAG ACCCGAAACC AGGTGATGTA CCCTTGGTCA GGTTGAAGTT 22140

CAGGTAACAC TGAATGGAGG ACCGAACCGA CTTACGTTGA AAAGTGAGCG GATGAACTGA 22200

GGGTAGCGGA GAAATTCCAA TCGAACCTGG AGATAGCTGG TTCTCTCCGA AATAGCTTTA 22260

GGGCTAGCCT CAAGTGATGA TTATTGGAGG TAGAGCACTG TTTGGACGAG GGGCCCCTCT 22320

CGGGTTACCG AATTCAGACA AACTCCGAAT GCCAATTAAT TTAACTTGGG AGTCAGAACA 22380

TGGGTGATAA GGTCCGTGTT CGAAAGGGAA ACAGCCCAGA CCACCAGCTA AGGTCCCAAA 22440

ATATATGTTA AGTGGAAAAG GATGTGGCGT TGCCCAGACA ACTAGGATGT TGGCTTAGAA 22500

GCAGCCATCA TTTAAAGAGT GCGTAATAGC TCACTAGTCG AGTGACACTG CGCCGAAAAT 22560

GTACCGGGGC TAAACATATT ACCGAAGCTG TGGATTGTCC TTTGGaCAAT GGtAGGAGAG 22620

CGTTCTAAGG GCGTTGAAGC ATGATCGTAA GGACATGTGG AGCGCTTAGA AGTGAGAATG 22680

CCGGTGTGAG TAGCGAAAGA CGGGTGAGAA TCCCGTCCAC CGATTGACTA AGGTTTCCAG 22740

AGGAAGGCTC GTCCGCTCTG GGTTAGTCGG GTCCTAAGCT GAGGCCGACA GcGTAGGCGA 22800

TGGATAACAG GTTGATATTC CTGTACCACC TATAATCGTT TTAATCGATG GGGGGACGCA 22860

tAGGATAGGC GAAgcGTGcG ATTGGATTGC ACGTCTAAGC AGTAAGGCTG AGTATTAGGC 22920

AAATCCGGTA CTCGTTAAGG CTGAGCTGTG ATGGGGAGAA GACATTGTGT CTTCGAGTCG 22980

TTGATTTCAC ACTGCCGAGA AAAGCCTCTA GATAGAAAAT AGGTGCCCGT ACCGCAAACC 23040

GACACAGGTA GTCAAGATGA GAATTCTAAG GTGAGCGAGC GAACTCTCGT TAAGGAACTC 23100

GGCAAAATGA CCCCGTAACT TCGGGAGAAG GGGTGCTCTT TAGGGTTAAC GCCCAGAAGA 23160

GCCGCAGTGA ATAGGCCCAA GCGACTGTTT ATCAAAAACA CAGGTCTCTG CTAAACCGTA 23220

AGGTGATGTA TagGGcTGAC GCCTGCCCGG TGCTGGAAGG TTAAGAGGAG TGGTTAGcTT 23280

CTGCGAAgCT ACGAATCGAA GCCCCAGTAA ACGGCGGCCG TAACTATAAC GGTCCTAAGG 23340
TGTCTCAACG AGAGACTCGG TGAAATCATA GTACCTGTGA AGATGCAGGT TACCCGCGAC 23460

AGGACGGAAA GACCCCGTGG AGCTTTACTG TAGCCTGATA TTGAAATTCG GCACAGCTTG 2 3 520 

TACAGGATAG GTAGGAGCCT TTGAAACGTG AGCGCTAGCT TACGTGGAGG CGCTGGTGGG 23580

ATACTACCCT AGCTGTGTTG GCTTTCTAAC CCGCACCACT TATCGTGGTG GGAGACAGTG 23640

iCAGGCGGGC AGTTTGACTG GGGCGGTCGC CTCCTAAAAG GTAACGGAGG CGCTCAAAGG 2 3 700 

TTCCCTCAGA ATGGTTGGAA ATCATTCATA GAGTGTAAAG GCATAAGGGA GCTTGACTGC 23760

GAGACCTACA AGTCGAGCAG GGTCGAAAGA CGGACTTAGT GATCCGGTGG TTCCGCATGG 2 3 820 

AAGGGCCATC GCTCAACGGA TAAAAGCTAC CCCGGGGATA ACAGGCTTAT CTCCCCCAAG 23 880 

AGTTCACATC GACGGGGAGG TTTGGCACCT CGATGTCGGC TCATCGCATC CTGGGGCTGT 23940 

AGTCGGTCCC AAGGGTTGGg CTGTTCGCCC ATTAAAGCGG TACGCGAGCT GGGTTCAGAA 24000

CGTCGTGAGA CAGTTCGGTC CCTATCCGTC GTGGGCGTAG GAAATTTGAG AGGAGCTGTC 24 060 

CTTAGTACGA GAGGACCGGG ATGGACATAC CTCTGGTGTA CCAGTTGTCG TGCCAACGGC 24120

ATAGCTGGGT AGCTATGTGT GGACGGGATA AGTGCTGAAA GCATCTAAGC ATGAAGCCCC 24180 

CCTCAAGATG AGATTTCCCA ACTTCGGTTA TAAGATCCCT CAAAGATGAT GAGGTTAATA 24240 

GGTTCGAGGT GGAAGCATGG TGACATGTGG AGCTGACGAA TACTAATCGA TCGAAGACTT 24300 

AATCAAAATA AATGTTTTGC GAAGCAAAAT CACTTTTACT TACTATCTAG TTTTGAATGT 243 6 0 

ATAAATTACA TTCATATGTC TGGTGACTAT AGCAAGGAGG TCACACCTGT TCCCATGCCG 24420

AACACAGAAG TTAAGCTCCT TAGCGTCGAT GGTAGTcGAA CTTACGTTCC GCTAGAGTAG 24480

AACGTTGCCA GGCAAAAAAT GGATGCGATG AGCCGCATTG AGACCGCAAG GTCTCTTTTT 24540

TTTATGTCTA AAACGTCAAA ATAAAAAGCA AACACAAAGA AAAATGGCTT GGCGAAGTGA 24600

AAACDTTTGA ATCTGACGAA ACGAGAAAAG ArCGCAACGA GTTTAGTAGA GCTAAATGAG 24 660 

TAAGyGAGAG CCGAAGrAGA GGAAAGAAGC AAGCGATTGT CACAAGTCAA GAAAGGTTCT 24 720 

TAGCGAsGAT GGTAGCCAAC TTACGTTCCG CTAGAGTAGA ACTGGAAATG ATAATTTAAT 24780

AATGTACACT TTCGATTGTC TAAGTATGTA CAACTTTAAT TTTGTGTTTA TATAAATTTA 24 840 

AAATGATATC ATCGAAAACA AAATATTGTA TAAATAGAGA AGAGCAGTAA GACGGTATCT 24 900 

AATTGAAAAT GATCTTACTG CTCTTTTATA TACTTTATTG AAATACAAAA AGGAAATTAA 24 96 0 

TTATTATACA ATAGACAAGC TATTGCATAA GTAACACTAA CTTTTATCAA AGAAGTGTTA 2 5020 

CTTTATAATT AATGATTTTA TTAGAGCGTC TACATGCGGT TTTAAAGCAT CATCGTCTAT 2 50 80 

ACCGCCAAAG CCTAATATAA ATTTAGGGGT TTTCTTATAG TCTTGATCAT CATCAAAATT 25140
TCCATTTTTT ACTGTAATTG TAAAATGCAT ACCCGTTTCA GCACCTTGAA TATCAAGCTG 25260

CTCTTTGTAA GGTTTCAATC TTTTTAAAAT ATAGGTTAGT TTTCTACGAT AAATTCGTCT 2 5 320 

CATTTTATTT AAATGCCTTT CAAAACCACC GGAAGATATA AACGTTGCAA TAAGGTTTTG 25380

CATATGAACA GGTACAGTGT TGCCTTCAAT GTGATTTTGA GAATGATATT TTTTCATTAT 2 5440 

AGAATAGGGT AACACCATAT ATGCAACTCG ACAGCTAGGA AAAATAGACT TTGAAAATGT 255 00 

ACTGATATAA ATCACTTTTT CTCCTCTTGA ATATAGACCT TGAATTGCTG GAATGGGTTT 25560 

GCCGAAATAT CTAAACTCGG AATCATAATC ATCTTCTATA ATAAATCGTT CTTCTTTTTC 25620 

TTGAGCCCAT TGTATTAATT GAGTTCGTTT TTTTAAGTCC ATCACATATC CAGTTGGAAA 25680 

TTGATGGGAA GGCGTTATAT ATACTATATT TTTTTGTGAT TTAATAACTT CATCTACGTT 2 5740 

TATTCCATTA TCTTCAACTT CAATTTGTTC ATATTCAACT TGTTTTTTAT CTAAAATATT 25800 

TTTGATTGGT GGATAACTAG GTTTTTCGAT AATAAATGTT GAAGTATAAA GTAAATCGAC 25860

TAATTGATTT ACTAATTGTT CGGTAGATGA GCCAATTATA ATTTGATTAG GATCACAAAT 25920 

TACGCCACGA TTAGTAAATA AATAAAATGC CAGTTGAAAC CGCAAATGTA ATTCTCCTTG 25980

AAAATGTCCT CTACGTAATT GATTTAAATG ATTTGTATCA TAAAGATCTT TGGAATACTT 26 040 

TCTGAAAAGT TCTATAGGGA AATGTTTCGT ATCTATTTCA TCCAAATTAA AAGCATAATC 2 610 0 

ATAAGCTTCA TCACTCGCTT TTGGTTTATA TGAATCATCA TCAAAAAGAG AGGGGATAGG 26160 

TTGATTGTTT AAAATTGTTA AAGATTCAAT TTCGGACACA AAATATCCAG AGCGAGGTCT 2622 0 

TGAATAAATG TAACCTTCGT CTAATAGAAG TTGATATGCA TGCTCTACGG TTGTTTGGCT 262 8 0 

AATAGATAAA TGTTTGCTTA ATTGTCTTTT AGAATAAAAT TTATCGCCTT CTTTAAATTG 2634 0 

ACCTTCAATT ATTTGTTTTT TTAATTTTTC ATAAAGTTGA TGGTATAAAG TGTTTTTCAA 26400

TTTXATAACT GACCTCCTAA ATTTATCTTA TTTTGTACCT TTTTAAATAT CAGTTTATAC 2 6460 

ATTACAATGT ATTTAATCAA CTTGAAAAGG GGTTTTATGT ATAATGAGTA AAATTATTGG 2 6 520 

ATCAGACAGA GTCAAAAGAG GTATGGCTGA AATGCAAAAA GGCGGCGTTA TTATGGATGT 26580

CGTTAATGCT GAGCAAGCAA GAATTGCAGA AGAAGCTGGC GCGGTAgCAG TTATGGCATT 2 6640 

AGAACGAGTA CCTTCTGATA TTAGAGCTGC TGGTGGTGTT GCACGTATGG CAAACCCTAA 26700 

AATTGTAGAA GAAGTAATGA ATGCTGTTTC TATTCCAGTC ATGGCTAAAG CACGTATTGG 2 6760 

TCATATCACT GAAGCAAGAG TATTAGAGGC GATGGGTGTT GACTATATTG ATGAATCAGA 26820 

AGTGTTAACA CCAGCAGATG AGGAATATCA CTTAAGAAAA GATCAATTTA CAGTACCATT 2688 0 

TGTATGTGGA TGTCGTAATT TAGGTGAAgm TGCGCGTAGA ATTGGTGAAG GTGCTGCTAT 26940 
ACAAGTTAAT TCAGAAGTTA GTCGATTGAC TGTAATGAAT GATGATGAGA TTATGACTTT 27060

TGCGAAAGAT ATCGGTGCGC CTTATGAAAT TTTAAAACAA ATTAAAGACA ATGGTCGTTT 27120

ACCGGTAGTT AACTTTGCAG CTGGTGGCGT TGCGACTCCT CAAGATGCTG CTTTAATGAT 27180 

GO AATTAGGT GCTGACGGTG TATTCGTTGG ATCAGGTATT TTTAAATCAG AAGATCCAGA 27 240 

AAAATTTGCT AAAGCAATTG TTCAAGCAAC AACACATTAC CAAGACTATG AACTAATTGG 27300 

AAGATTAGCA AGTGAACTTG GCACTGCTAT GAAAGGTTTA GATATCAATC AATTATCATT 273 6 0 

AGAAGAACGT ATGCAAGAGC GTGGTTGGTA AGATATGAAA ATAGGTGTAT TAGCATTACA 27420 

AGGTGCAGTA CGTGAACATA TTAGACATAT TGAATTAAGT GGTCATGAAG GTATTGCAGT 27480 

TAAAAAAGTT GAACAATTAG AAGAAATCGA GGGCTTAATA TTACCTGGTG GCGAGTCTAC 2 7540 

AACGTTACGT CGATTAATGA ATTTATATGG ATTTAAAGAG GCTTTACAAA ATTCAACTTT 2 7600 

ACCTATGTTT GGTACATGCG CAGGATTAAT AGTTCTAGCG CAAGATATAG TTGGTGAAGA 27660

AGGATACCTT AACAAGTTGA ATATTACTGT ACAACGAAAC TCATTCGGTA GACAAGTTGA 27720 

CAGCTTTGAA ACAGAATTAG ATATTAAAGG TATCGCTACA GATATTGAAG GTGTCTTTAT 27780 

AAGAGCCCCA CATATTGAAA AAGTAGGTCA AGGCGTAGAT ATCCTATGTA AGGTTAATGA 27840

GAAAATTGTA GCTGTTCAGC AAGGTAAATA TTTAGGCGTA TCATTCCATC CTGAATTAAC 27900 

AGATGACTAT AGAGTAACTG ATTACTTTAT TAATCATATT GTAAAaAAAG CATAGCTTAA 27960 

TGTATGCTAA ATCAACGAAT TATTGATATT TATAGATTTG TTGAGAAGAA AATATCTCCT 2 8020 

TCAAACTTAG CTTTGGAGGA GTTATTTTTT ATGTCAAAAT TAAAAATGAT AAAAAATAAA 2 8080 

GCTATACATA AGAAAAAAAC CCTTCAAAGA GACTGAGAAT AGTCAAAATT TTGAAGGGGT 2 814 0 

TAATTCGATG TTGATGTATT TGTTAAATAA AGAATCcAGC GATTGCAGCT GAAATGAAAG 2 8200 

ATACTAGTGT tGCACCGAAT AATAATTTCA AACCAAAGCG GGCAACTGTA TCTCCTTTTT 2 8260 

TGTCATTAAG TGATTTAATC GCACCTGAAA TAATACCGAT AGAGCTAAAG TTAGCAAATG 2 8320 

ATACTAAGAA TACAGATGTA ACACCTTTTG CGTGTTCAGA TAAATCACTA AGTTTACCAA 2 83 80 

GTGCTTGCAT TGCTACAAAT TCGTTAGATA ATAGTTTTGT CGCCATAACT GAACCGGCTT 2 8440 

GAACTGCATC TTGCCATGGC ACACCGACTA AGAATGCAAA TGGTGCAAAG ACAAAACCAA 2 8500 

TTAATGTTTG GAAATCCCAA GAAATAGCGC CACCTGAAAC TGTACTAAAG ATATTGCTTA 2 8 560 

CAATTCCATT TAATAGAGCG ATAATGGCAA TGTATCCGAT TAACATTGCG CCTACAATGA 28620

CAGCTACTTT AAATCCATCT AAAATATATT CTCCTAGCAT TTCGAAGAAT GATTGTTGTC 2 8680 

-^^^P^Y ^T>r^TrMiCT AATAATTTGT CATCTTCTTr ATTAA r TT T, A ^AAGGGTTAA -0-74- 
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TAGGTTCAAT TAAGGTAAAG TATGCACCGA TAATTGAAGC AGAAACAGTC GACATTGCTG 28860

AAGCTGTTAA TGTGTATAAA CGTTGCTTAG GTATGTATGG TAATTGTTTT TTAATTGAAA 28920

TAAATACTTC AGATTGTCCC AAAATTGCTG CAGCAACTGC ATTGTATGAT TCTAAACGTC 28980

CCATACCATT AATTTTAGAA ATTAAGAATC CTAAAACATT AATGATTAAA GGTAAAATCT 29040

TTGTGTATTG AAGGATACCG ATAATCGCTG AAATAAATAC GATAGGTAAT AATACACTGA 29100

AGAAGAATGG TGGTTGCTTA GGATCGATAT ATTGAATACC ACCGAATACA AAGTTAACAC 29160

CATCTGCTGC TTTTAATAAT AAGTAGTTAA AACCGTTTGA AATACCACCA ATAACCTTGA 29220

TTCCCATTGT AGTTTTAAGC AAGATAAATG CAAAGATAAG CTGAATTGCA AGTAAAATTC 29280

CTACATATTT CCAGCGAATA TTTTTCCTGT CTGAGCTAAA TAGAAACGCA AGTGCTAAAA 29340

AGAAGATAAT TCCGATAATC CCAATTAGAA TATGCATATA TTTCTCATTC CTTTAGTTTT 29400

TTCTACaATc TATCATACAA TAAAATGGAA GGGCTAACAT CATAAATTTT TGAAAATATA 29460

AAAACAAATT AATTGAAAAA GGTCAAAATA GGTCATATAA TATAGTCAAA GAAGGTCAAA 29520

AAGGGGTGAT ATACATGCAC AATATGTCTG ACATCATAGA ACAATAaTCA AACGTTTATT 29580

TGAAGAGTCG AATGAAGATG TCGTTGAAAT TCAGAGAGCG AATATCGCAC AGCGTTTTGA 29640

TTGCGTACCA TCACAATTAA ATTATGTAAT CAAAACACGA TTCACTAATG AACATGGTTA 29700

TGAAATCGAA AGTAAACGTG GTGGTGGTGG TTACATCCGA ATCACTAAAA TTGAAAATAA 29760

AGATGCAACA GGTTATATTA ATCATTTGCT TCAGCTGATT GGACCTTCTA TTTCTCAACA 29820

ACAAGCTTAT TATATTATTG ATGGGCTTTT AGATAAAATG TTAATAAATG AACGTGAAGC 29880

TAAAATGATT CAAGCAGTTA TTGATAGAGA AACGCTATCA ATGGATATGG TTTCTAGAGA 29940

TATTATTAGA GCAAATATTT TAAAACGTTT GTTACCAGTT ATAAATTATT ACTAAATGAA 30000

ATGAGGTGTT GAAGTGCTTT GTGAAAATTG TCAACTTAAT GAAGCGGAAT TAAAAGTTAA 30060

AGTTACAAGT AAAAATAAAA CAGAAGAAAA AATGGTGTGT CAAACTTGTG CTGAGGGGCA 30120

CCATCCGTGG AATCAAGCTA ATGAACAACC TGAaTATCAA GAACATCAAG ATAATTTCGA 30180

AGAAGCATTT GTTGTTAAGC AAATTTTACA ACATTTAGCT ACGAAACATG GAATTAATTT 30240

TCAAGA 30246



(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14333 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

TATTCCCCCA TCGGTTTATT AAATCGTCCA TTTCAATACT GTTTTTCCCC AAGATGTCGA 60

TAAATCCATT TCAAACGCTT GGACGATATC TTGCATCGTA CATACATTAA TTTCATGTCC 120

TTTTAATAAT GCTAACTTTT CAACTATGTC TGGGTACTTA CGATATAAAT CAACAACTTG 180

CTCAAAATCT TTAGAGCCGC TTCGACTACT ACCAATCAAC GTTAATCCTT TTTCAAGTAC 240

TAATCGTGTA TTCACTTCCA CGGGTAATTC ACTTACGCCT AACAAAGCAA TACTGCCTTC 300

TGGTGAAATA TGTTCAACTA TTTGTTGAAG TGCAACTTGA CTTCCTTTAC CTCCAACACA 360

TTCAAATGCA TGATCAATTT TAAGATCATC TGGTATTTGA TTTACTGTAA AGATGTCATC 420

TACAAATGAA AAATGACTTA ATTTATAGTC TGTCTTACCA AATACATAAG TTTTAGCTTC 480

TGGGTACAAC TTACGTAGCA AAATAGCAGT AATATAACCT AAGTTACCAT CACCCCAAAT 540

ACCAAAGCTG GTTTTCAAAG GTATAGATTT ACGTTCAAAT CGTTGTATAG CATGATAACT 600

TACTGACACT AACTCTGTGT ATGAAATCGT ACTCAAATCA ATGTCATTAG GCAGCGGAAC 660

GATACGATCA TGTGCCATCA CAACGTAGTC TTGCATAAAA CCATCATAAC CACTAGATCT 720

AAAATAACTA GAGGCTAAGT AATTCTCCGC AATAATATGA TGTTGCTCTG TAGGTGTATT 780

CGGTACCATT ACTACTTTCG TACCTTTTTC AAATACCCCT TTACTATCAA ATACAACTTC 840

ACCAACAGCT TCATGAACTA ATGACATTGG TAATTTTTTG CGTAGTACAT TTTCATCTCT 900

TCGACCTGTG TAATACCTTT GATCAGCTGC ACAAATAGAC AAGTATAAAG GTCTTACGAT 960

GACATGATTA CCATAAATAT CAACATTATT ATATGTGACG TCGAACTGTC TCGGTGCAAC 1020

GAGTTGATAT ACTTGATTAA TCATCGGCAA TATCACCTTG AATAATGGCA TTTGCTACTT 1080

TTAAATCATA CGGTGTTGTC ACTTTAATGT TGTATAGTTC TCCaCGTACC AATTTAACTG 1140

CAT^TCCAGA TTCGACAATG ATTTTACATG CATCTGATAA GATTTCTTTT TGTTCACTAC 1200

TTAAGGCGCG ATAACTATCT TGTAATAATT TAATATTAAA TGATTGTGGT GTTTGGCCTT 1260

GATACATTTC ATTCCTTACA GGGATACTGT GTATGTTCTG TTTATCTTTA GACATTACAA 1320

TCGTATCAAT TGCTTCAATG ACTGTATCTA CTGCACCATA TTTTGCTGCT ACTTCAATGT 1380

TCTCTTTAAT AATACGTTGA GTTAAAAATG GTCTTACGGC ATCATGAGTT ACAATCACAT 1440

CATCATTATT AATTCCATTT ACATTGCGAA TATGGTCGAT AATGTTCATA ATTGTTTCAT 1500

TTCGATCCGT ACCACCTGCA ACTACTTTGA CACGTTGATC TGTAATGTTA TATTTTTTTA 1560

AAATATCCTG TGTATGGGAA ATCCACTGTG CTGGCGTTGC GATAATAATC TCATTAAATT 1620
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CTGCATAAAT CATGTTGTCC TCCATTCTGT CATTACATCA TTTCCATTTA TACATTACTG 1800

ACCTATGCCC GCACATAAGC CTAACCTATT GCTCACTTGC CTCTTTTATT AATCCAAAGA 1860 

5 

TAGTTGTCAC AATAGTGTGA TAATTTTTTA TAAAAATGTA TTTTTGTAAC TGACCATTCT 1920 

AAGTTGTTTT GCCATGCAGT TAATCATTAA CTCTGACGAT ATTAAATTGT TAAAGGTATT 1980

AATGTTTACT CTTTTTCAAA TTCATTATTA CTGCCATCAT TTTACCATAT ATTATAATAA 2 04 0 

10 

ATTTATCTTA TTAAGTGGCT GTACTTGATT TTCACTTTAA AAATTATCAA ATATTGCCAT 2100 

CTCATTTTAA GTATACAAAA TGCAAAACAA CCGATTCACA AGCATATTTC ACACAAGTAA 2160 

15 ACCGGCTATT TATCAACGTA TATTCGAAGA TGAATTATTT CGATAGTATC TATAGACCAG 222 0 

ACGGCATTCG CACTTTCATA GCTATAACTA TACCAGCGTT TTCGTCCTCA AAGGTGCATA 2280 

CTAATAAATC GTAAACATGA CTTTATCAAA TCGTTCTTTC TTGTTAACTA ATTTATCAAA 234 0 

20 TGTCTCCGGG CCTTTTTCTA ACGGTAAAAA ATGAGAAATA ATAGGCTTTA CATTAATATC 24 00 

TTTCGTCTTC ATATAATGTA AGGTTGCCGT CCACTCTTTG CCCGGAAAAT TACTGGACAA 24 60 

ACAGTTCCAA GAGCCACATA CTGTCAACTC GTTACGCAGA ATTTTTTCAA AATGAACGCG 2520 

25 ATCAATCTCA ATATCATCAT ATGGTATTCC GAGTAATACC ACCTCGCCAC CTTTTTTAGG 2580 

TAGCGTCAAT ATTTGACCAA TCGTAACTTT AGCACCTGAT GATTCTATAG CTAAATCGAT 2640 

TTGATTGGCG TAATGATTTT CGATGAATTT CTCAAGATTT TCTTCTTTTG AATTGATTGT 2700 

30 

TTGATGTGCG CCCAATGATG TTGCAATATC TAGTTTATGC GCATCTATAT CTATAGCGAT 276 0 

GATATGTGCA GCACCAAATA TTCGTGCCCA TTGAATAGCT AACAAACCTA TACTGCCACA 2820

CCCCATTACT GCAACAGTCA TACCAGGTTG TATATTCGAT TTATAAAACC CATGCGCAAC 2880

AACGGCTGAT GGCTCAACCA TTGCTGCTTC AATGTAATCA ACATTGTCTG GAACCTTTAA 2 94 0 

AACATTTTGC GCTGGCAATT TGACATATTC CGCGAACGAT CCAGGTTCAT ATGAGCCAAT 3000 

GACGAATAAC TTTTCACATC GTGCATATTC ACCTTTTAAA CAATACTCGC ATTGATAACA 3060

AGGTATTGCT GGGCAACCTG TCACTTTGTC GCCCACATTA ACATGCGTAA CATCACTTCC 3120

AATGGCATCT ACTACACCTG AAAATTCATG ACCAAATGGC ATACCTTTAA TGTATGGCCC 3180 

45 CATTTTTTTG TATCGTGACG TGTCTGAACC ACATATGCCA GTCGCTCGTA CTTTAATAAT 324 0 

AACGTCATTC GCACTTTCAA TGACTGGCTT TTCATTATCC TCATACCGTA AATCTTCCAC 3 300 

GCCATATAAT TTCAATGCTT TCACTTGTAA ATCACCTCAA ATTTGATTTA ATTCACAACT 3360 

50 

TTTTTCTTTT TAAAAATACC TGTCGCAAAA TAACCTGCAA TGACAATGGA ATTACTTACG 3420 

AGTAAATGTT CCATATAAAA ATCAGTGATT TGTCTTAATG GCCCAAGCAT AAAAGTTAGC 3480 






TGCTTTAATA CCTTCGCCGG ATTTTAAATG TTGATACGCC TCGTCCCATT TCGAAATATC 3600

ATATATTTTT GTCACCAAAG CTTCAGCATT TACTAAACCA TCCGCCATAA GTTGCAATGA 3660

AGGTTCCCAA TCTGCTGGCT TTTGACTTCT ACTACCAACA ACTGTTATTT CTTTTTGAAT 3720

CACTTTTTCC ATATCAAATG GAATTTCAGC ATCCTTAAAA ATACCTATTT GACTGTAGAA 3780

ACCTTTTTTC CGTAAAATAT CCAAACCTTG TCGTGCTGCT GGAACTGCAC CTGAACATTC 3840

AACAACAACA TCTGCACCGT AACCGTCTGT AATTCCATTG ATATACGTTT TTAAGTCTGT 3900

TTGTTGTAAA TTGACTACAT AATCCATGTG CAATGCTTCT GCTTTATCTA ATCTGACTTT 3960

GTCATTGTCC AATCCAGTTA CCACAACAGT TGCGCCTTTA CTTTTTAACA CTTGTGCTAC 4020

AAGTAATCCG ATTGGCCCAG GTCCCATTAC AACTGCTACA TCGCCTGAAT TGACTTGAAT 4080

CTTAGAAACG CCATGATGTG CACATGCTAA TGGTTCTGTC ATAGCTGCAG ACTGATACGA 4140

TAtTCGTCTG GAATATGATG CAAACTTTCT TCACGTGCAA TGACATAATT AGTAAATGCG 4200

CCATCAACTT GTGTTCCAAT ACCTTTTCGA TGGTTGCATA AATTATAGTC TTTTGATTTA 4260

CAGTATTCAC ACTCATTACA AACATAGAAT GTCGTTTCAG aTGtGACACG GTCACCAACT 4320

TTAAAATCTT TAACGTCTGC TCCAACTTCA ACGATTTCAC CAGAAAATTC ATGACCTAAT 4380

GTCACTGGAA AATTAACTTT ATAATGACCT TCATAAGTAT GAATATCTGT GCCACAAATT 4440

CCTGCATAAT GTACTTTAAT CTTTACTTTA TCATCTAGCG GTGTTGCAAC TTCTTTATCA 4500

AGAAGTTCTA AGTTGCCATG TCCTTCTCTT GTTTTTACTA AAGCTTTCAC CACAAACACC 4560

TCGATTTTTA ATTGAATAGA CTAAATAGTT TAAAGATAAG ATAGTTAACG ATATTACCAC 4620

CTTGATCAAT ACTTGAAATT TCAGATGAAC CTTTTGGCAT TTGTACATTC GTACCTTTCG 4680

CCATATCTGT GAAAATGGGT GCTACGTCTG TTGCAATATA TAGTGAAATT GCAATCATAA 4740

TCGTACCCAC AATGACAGAA TGAATAATGT TTCCTCTTGC TGCACCAACA ATAAACGCGA 4800

CAACAAATGG TATCGTTGCT AAGTCACCAA AAGGTAGTAC TTGGTTTCCT GGTAAAATAA 4860

CGGCTAATAA AACAGTGATA GGTACTAAAA TTAATGCTGT CGAAATAACT GCTGGATGAC 4920

CTAATGCTAC AGCCGCATCC AATCCAATAT AAATTTCACG TTCGCCAAAA CGTTTATTTA 4980

GCCATGTTCT TGCAGACTCT GAAACTGGCA TTAAACCTTC CATTAAGATT TTTACCATTC 5040

7AGGCATTAA TACCATTACT GCAGCCATTG ACATTCCTAA ATTAATGATG TCTCCAGGTT 5100

TGTAACCTGC TAACACACCA ATACCTAAAC CTAAAATTAA GCCGACAAAT ATAGACTCTC 5160

CAAATGCGCC AAAACGTTTT TGAATTGTTT CAGGATCAGC ATCTAACTTA TTCAGACCGG 5220

GTACTTTTTG TAACAATTTA ACTAAGTAAA TACCTGGTGC ATAAGAAATT GTACTTCCTG 5280
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CTACTTTCAA ACAGATAATT TGGAAAATAA CTGCTGCTAA TAACGCTTGC CAAATACTGC 54 0 0 

CTGATACGGC ATAAACCATT GCTGCTGTAA ACGTATAATG CCAAAAATTC CAAATATCTA 5460

5 

CATTCATCGT CTTTGTCACT TTAGTTACTA GCAATACAAC GTTAACTATG ATTCCGAGTG 552 0 

GAATAATAAA TGCTGCGACA GATGATGCCC AAGCGATAGA TGATGTTGCT GGCCAACCTA 558 0 

CATCAATCAC ATTCAGACTG ACGCCTAAAT TTTTAACCAT CGCTTGTGCT GCTGGCCCTA 564 0 

10 

AATTTTTAAC TAATAAATCG ATGACTAAGA AAATCCCTAC AAAAGCCACA CCTATTGTTA 570 0 

AACCAGACCT AAATGCCGCT CCAATTTTCT GCCTAAAGAA TAGGCCAAGC AAGAATATGA 5760 

15 CAACCGGTAA AATAACAGTt GCACCTAAAT CTAAAAATCC CCTTACAAAA TCAGTGAAGT 582 0 

AACTCATATT TAAACCCTCC CTGTTATATA TGCATTGTCA CGATACTTTC CGATTGTGAT 58 8 0 

TACATTTGAC GTTACAGTCA TTTCAACGAC AACCCTTGCT AAATTCGACT GCAGTCCTTT 5 94 0 

20 TGAATTACAG tCACTGCGTT TCTATGTCAT CAACAATCAT TTGTCGTGAT AGTCATTTAT 6 00 0 

ATGCAATTTG CATATATTAA TATGTTATCG ACCCACGTTA CATATCAATT CCGTTATTTT 6060 

TGTAACTCTG TTAAGATTTG TTGTTTTGTT TCTTCAATAC CAATACCAGT TAAGAAATTA 612 0 

25 

CGTGCGTTGA TAACTGGGAA TTTATATTCT TTTTTTGTCA TTGCAGTTGT AACTAATAAA 618 0 

TCTGCAGTGT CTTCATAAGG TCCAACTTCT GTAATTTTGA TTTGTTTAAT ATCTACTTTA 624 0 

ATATTGTGTT CCTTTGCCAT TTCTTCAATT GCATTATTTA CTACTGTTGA CGTTGCAATA 6300 

30 

CCTGCACCAC ACGCTACTAA TACTTGTTTC ATTTTCAATT CCTCCAATTA ATTTTTAGTT 63 6 0 

ATATTCCAAA TAATCATTGA TTAGTGTTGC TAAAATTGTT TCATCTTTCG TTCGTAGAAT 64 2 0 

35 CTGCTCCAAT TTTTCTTCAC TTTGAAAAAT TTGCATCAAC TGTTGTAACA GCTTAAGTTG 64 8 0 

ATCATCTACT TTATCCATTG CTAACATAAA AACGATTTTC ACTTCTGTCT GTTGATCAAG 6540

TGTTCCCATT TCAATAAACG GCACTTCTTT TTCTAGAACA GCCACACCTA TCGTTCTATG 6 600 

40 GTTAATATGT TCGACATCTG TATGCGGTAT AGCGACCGAA CATAGATGCG TTGGTAAACC 6 660 

AGTAGCAAAT TCTTTTTCTC TGTCGATGAC TGCATCTTTA AACGTTGACT TCACGAACCC 672 0 

ATTTTGAAAT AACACATCTG ACATTTGTGA CAATACGGAT TCTTTATCAG TTGCCGACAA 67 80 

45 ATTGAGCATT ATATTTTCTT TATGCACTAA TTGCTGTCCC ATCCATTTTC CCTCGCTTCT 684 0 

TTATTTGAAT AATTTTTTAA AATCTCATTT ACATCAGAAT TTTTGCGACT TTGTATGATG 6 900 

CGCTTAATTG CGTCATTGTC TTGCGCCACA TCTCTCAATT GTAGTAACGC TCTTAAGTGT 6 96 0 

50 

GTCACTTTAT CAACAGCAGC AATAGGTACA ATAATATGGA TTGCTGTGCC ATCTGACATG 702 0 

TATATTGGTT CTTGTAATAT CAACATACTC ATCGCTGTTT TATGTACATG CTTTTCAGAG 7080 
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TGCATCTCAT GAATATATTT AATATCAATA AAATGATTAG CAACTAACAC ATCACTTGCT 72 00 

TTAGCAATAG CTTCATCAAT ATTTTCAACA TGATGCATTC TTTTCACGTG CCTTGCCGGT 7 260 

5 

ATCAAGTCAG CTAAATCTAA TGyCTwATTT tGTGtGACaA TCGATCCATT AATGGTTGAA 73 2 0 

ATTGAATTAT AATTGGCAAT AAAATCTTCT AAACCATCAC GTAGTcTGTA ATGTCATTAA 7 3 80 

CTGTCGTTGT GCGTTCAATT AATGCCATTA ACTTGTTTAT TTCCTTATCA ATGTCAGCCG 7440 

w 

ATTCCTTATT AATGTACTTC ATCACTTCTT TACGTAACTT TCGTTGCTCA TTTTCAGATA 7 5 00 

AAGCTACTTT TGTGATAAAT AATTTTTTAT GTGTTAGGAC AAACATTGGT GAAAAGACGA 7560 

15 TGTCATAATC TAATGTGTAA TTTTCAAATG TTCTAAGTGA AATCGCATCT AAGAAAATAA 7620 

TTTCTGGAAA TAAGTTTCGC AACTCGTATA ACATCATTTG TGATACTGAC GTGCCTTGTG 7 6 80 

TACACACGAT AATAGCTTTT ATCTTGCCAT CGAAGTTTTC ATCTTGACGT CTCAAACTAC 774 0 

20 CTCCGAACAA CATGGTTAAA TATGCTATTT CATTATCAGG CAACGATTTT CCGAAATATT 7 800 

CAGTTAACGA TTGACATGAT TGTTTCACCA TATGAAATAA GGATTGATAA TTTCCTTGTA 7860 

AAGGATTTAT TAATTCATCA CGATCCGTTA AGTTATATTT AATCCTATAA AAAGCAGGCG 792 0 

25 TTAAATGTAA CAAGAGTTGC TGTGATAATT TCTCCTTATC TTCAATGTTA ATAAAAGTGA 7980 

TTTGTTCAAA ATGGTGAATC ATTTGAGCGA TGGCCATCGT TAAATTCGAT ATGCTATCTG 8040 

ATTCTTGCAA ATCAGTCCAT TGCACACTTG TTGAAAGTAA GTGTAATGTC AAATATAACT 8100 

30 

TTTCCGCTTC TGGCAAATCC GGCTCATGTT GCGTCATAAT CTCCGTTGCT TGATATTCTT 8160 

TCGTATCCCT CAAATACTGA TAATTAATAT TTAATGGATT CATCACATGA CCACTTTGAA 82 20 

TTCGTCTACG AATCACACAA AGGACATAAG GCAATGAACT AAGTGATTTG TCTATAAAGC 8280 

35 

GACTCTTCAA AAATTGTTCT ACCTGTTTGA TCTTGTCTTT TTGATATGCG ATATCTTCGA 8340

ATC&AAGTT GAGCGCCTTT AAAACTTCAC TTTTAGTAAT ATCATGATTC AACCTTTGAT 84 00 

40 CAATCAACTT AATGAAGAAA CGGCGAACTT CAAATTCATC ACCAACAATT TCATAACCAT 84 60 

GTTTTCGAGA ATACTTAAGT GACAAACCAT GATTTTCCAA TTGCTCTTTC ACATGATTTA 8 52 0 

TATCGTGAAT GACAGTATTT TTACTGACTT GTAAATCAAT TGAAAAATGG TTTAGAGACA 8 5 80 

45 TTGCGTTTTC CTTACTAAAA AGCATGAGCA TTAAATAATA ACGACGTGTT TCTATGCTAA 8640 

AAATGACATT GTTGCCGTTT AACATTTGCT GCTCCGATAC ATCTCGCTTG AATAACGTCA 87 00 

TGATTTCAGA ACTTACAATA AAATTTCCTT GGCTTGTTCT TTCAAGTTTT GGATAACCCT 87 6 0 

50 

CTTGTTCAAG CCACAAATTG ATTTTTTGAA TGCGATATCC TAGTTGTCTA CGAGACAAAC 8 820 

CAAATATCGA TTCAAGTTCT TTACCATGAA TAGTAGGATT CAATACAATT TT^T^A^A oqq- 
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TCAATCGTCA CACCGATGTA CACACTTTGA ACACATATTT TCAAAATGAG CATGTACATC 90 0 0 

ATTGTGATGT TTTAACAACA TTTCAATTAT ATCTATATTT TTTGTGATTT TAATCTTTTA 9060 
AAATAAAGCA ATTGAAATTT TTGCATATAT TTTTGTGTTT TGTGTTTTTT TGAAGCATTT 912 0 

TTAACATACA TATCTCAATC ATTATCAAAT TGTCATGACC ATTGTAACCC AATACAAAAA 9180 
CCCTAAGGAC GCTTATATCA GGCGCCTTAG GGTTAACTGT ATCTATTTAA TTAAGTATTA 924 0 

TTATTCGTAT GTACGTAACT TATGGTCTAT CAAGTTCCAC ACTTCTTCAA CATCAACTGC 9300 
TGTAGCAAAA TAAGCATTGG CAGGCTTACC TGTAACATGA TTTAAATCGA CAGCCATAGT 93 6 0 

GCCATAAGTT AGTGGACTTT GATGTTCAAT GTCGATATTA ACGGGTACCA TTGTAAACAA 9420 
TTCTGGTTGT AACAAATACA AAATTGTACA AGCATCATGT ATTGGACCAC CATCCATATT 94 80 

AAAGTGAGTC TTGTATGTCT TCTTAAAGAA TTGCAATAAT TCTACGACGA ACTGTGCAAC 954 0 

20 AGGATTATTG ATACTTTCAA AGCGTTCAAT CACGTGATCG TCGGCTAAAA CTTGATGTGT 96 0 0 

TACATCTAAA CCAAACACAT TTATAGTAAT CCCACTTTCA AAAACACGCT TCGCTGCTTC 9660 
AGCATCTACC CAAATATTGA ATTCTGCTGT AGGCGTCCAA TTTCCAAATG TACCACCACC 9720 
CATCAAAGTA ATAGATTCAA TATGCTCAGC GATTCTTGGC TCACGAATCA ATGCCGTTGC 97 80 

TACATTCGTA AGAGGACCTG TCGCTACAAT TGTTACAGGT GTATCACTCG TCATCACTTT 984 0 

GTTTATAATC ACATCTGATG CTGGCATTGC AACTGCTTGA CGTGATGGTG TCGACGGTAG 9 900 

TTTCGGACCA TCTAATCCAG ATTCCCCATG TATTTCAGAA GCAAAGGCAG CTGGTTTAAT 9 96 0 

TAACGGCCTA TCCGCACCTT TCGCTACTGC TATATCTTGG CGTCCCATAA TATCCAATAC 1002 0 

GTTCAAGGCG TTTGTCGTAT TCTTGTCAAC TGATTGATTA CCTGCGACTG TTGTTACAGC 10 0 80 

TAATATCTCT AGTGGACTGT CAATTGCCCC CGCTAAAATT AATGCTATTG CATCATCGTG 1014 0 

TCCTCGATCA CAATCCATAA TAATCTTTCT TTTCATTTAT ATATCCACCT TTCTTAAGTT 10200 

GTTATCGATA GCTTATGTAT ATTTATTTAT GTGGTGAATC ATGTTTATTT TGAAAAATAG 10260 

TTTTAACTTT CTCATATTTT TGGATACAAA CACTATTTAT CTATTTTATG GCTTATAAAT 10320 

TTATCCGATA TGCCTTATCA ACCTACCTCG CTAAAAATAG GATGTCTACA TATCTATACC 10380 

45 GACTTTTGTC AACTCATTTT CACAACAATA TAAACAGCAA TTTATATGAT TGTTACATGA 10440 

TTCAAACAAT TTTTATGAAA AATATTTTCA TACACAGAAT ATATATTGAT ATTAAATTTC 10500

TCAAAAGCTA TATTGAGAAT AATTAGGAGG GATGTTGATG AAATCTTTAT TTGAAAAAGC 10560 

ACAGCAGTTC GGCAAGTCCT TTATGTTACC TATCGCAATC TTACCAGCTG CAGGTCTATT 10620 

GTTGGGTATC GGTGGTGCAT TAAGTAATCC AAACACCGTT AAAGCATACC CTATTTTAGA 10680 
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AAATTTACCG GTCATCTTTG CAATTGGTGT CGCAATCGGA TTATCTAGAA GCGATAAAGG 10800

TACTGCAGGT tTAGctGCGC TGCTCGGTTT CTTAATTATG AACGCAACTA TGAATGGCTT 10860

ATTAACTATC ACGGGCACAT TGGCAAAAGA TCAGCTTGCA CAAAATGGAC AAGGCATGGT 10920

GCTCGGTATA CAAACGGTTG AAACCGGTGT TTTTGGCGGG ATTATCACAG GTATTATGAC 10980

CGCAATACTT CACAACAAAT ATCACAAAGT GGTATTACCA CCGTATTTAG GTTTCTTTGG 11040

TGGCTCTAGA TTTGTCCCTA TTGTCACAGC ATTTGCCGCA ATCTTTTTAG GTGTATTGAT 11100

GTTTTTCATT TGGCCAAGCA TACAAGCCGG CATTTATCAT GTTGGTGGAT TTGTAACGAA 11160

AACAGGTGCC ATCGGTACTT TTGTTTATGG CTTCATCTTA AGATTGTTAG GTCCACTCGG 11220

TTTACACCAT ATTTTTTACT TACCGTTTTG GCAGACGGCA CTTGGTGGTA CTTTAGAAGT 11280

CAAAGGGCAC TTAGTTCAAG GTACGCAGAA CATCTTCTTT GCTCAACTTG GTGATCCAGA 11340

TGTGACGAAG TATTATTCAG GTGTGTCACG CTTTATGTCA GGCCGTTTTA TTACGATGAT 11400

GTTCGGC1TA TGTGGTGCCG CACTTGCAAT TTATCACACA GCTAAACCTG AACATAAAAA 11460

AGTTGTCGGC GGTTTAATGT TATCCGCTGC ACTCACTTCA TTTTTAACAG GTATTACCGA 11520

ACCTTTAGAG TTTAGTTTCT TGTTTGTCGC ACCTATTCTT TATGTAATCC ATGCCTTCTT 11580

TGATGGATTA GCATTTATGA TGGCAGACAT TTTCAACATT ACAATTGGTC AAACCTTCAG 11640

TGGAGGCTTT ATCGATTTCT TACTCTTTGG TGTGCTACAA GGTAATAGTA AAACAAACTA 11700

CCTATACGTC ATACCTATTG GAATTGTGTG GTTCTGTTTG TATTACATCG TTTTCAGATT 11760

CTTAATTACG AAATTTAATT TCAAAACACC TGGTCGAGAA GATAAAGCTG CAGCACAACA 11820

AGTTGAGGCT ACTGAAAGAG CACAAACTAT TGTTGCTGGT TTGGGAGGCA AAGATAACAT 11880

TGAAATCGTT GACTGTTGTG CAACGAGACT ACGCGTCACA CTTCATCAAA ATGACAAAGT 11940

CGATAAAGTA TTACTCGAAA GTACTGGTGC CAAAGGTGTA ATCCAGCAAG GCACTGGTGT 12000

GCAAGTAATT TATGGGCCTC ACGTTACAGT TATCAAAAAT GAAATTGAAG AATTGCTCGG 12060

GGATTAAGAC TAACCGAAAT ATCAACAGAA CTAATGGCAA CGATGTACGA AGTAAGAAGT 12120

3ACATCGTTG CTTTTATTTT TAATGTTACA TTTGAAGCAT TAAGTTCATC ATGCACTGTA 12180

GTGAGCCCGC AAATCGCCTC TGCTAGACAA TCATCTTAAT GCTATGATTA AAGCTTAAGT 12240

GCCAGATTTG AATTTAATTT CAACAACGAC TTTCACTACA TTAAAAATAG GGCCACTCGA 12300

CACATATAGT TGTATCAAAT AGCCCTTTAT ACAATTTTTT GGGTAAGGTT TTACAATTTT 12360

TGGGATGGTA TAGATTTTAT AAAAAGTTAT TTAAGTTCTT CTGCTTCAGC CATAATATCT 12420

TTTAATGTTT TAGCTGAATG TGCGAACTTG CTTTGTTCTT CGTCGTTTAA TGGGATTTCT 12480
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TCCTCATATT CGCCTTCTAA TAATGCTGAT ACAGTCAATA CGGCATCTTC ATTTCTGAAA 12600 

ATCGCTTCAG TAATTCTAGC TAATCCCATT GCAACACCAT AATAAGTGGC ACCTTTAGCT 1266 0 

5 

TGAATAATGT CATATGCTGC ATCACGTGTT TGAACAAAAA TTTGTTCAAT TTGCGCTTTG 1272 0 

CCCTCAGGAC GTTGTTCAAG TAATGTCTTC AAAGGTTGAC CCGCAATATT AGCGTGTGAC 12 78 0 

CATACTGGTA ATTCAGTGTC ACCATGTTCA CCAATAATTT GAGCATCGAC GCTACGTGGC 12840 

10 

GCAACATCGn AcgyTcGCTT AACAATAATC TAAAGCGTGC AGAGTCTAAA ATTGTACCAG 12900

AACCTATAAC ACGTTCTTTA GGTAAACCAG AGAATTTCCA TGTTGCATAC GCTAAAATAT 1296 0 

CAACAGGATT TGTAGCTACC AAGAAAATAC CATCAAATTT TGATGCCATT ACTTCACCAA 13020

CAATTGATTT GAATATTTTC AAGTTTTTAG ATACTAAATC TAAACGTGTT TCTCCAGGTT 13 08 0 

TTTGTGCAGC ACCAGCACAG ATGACAACTA GATCCGCATC ATGACAATCA CTGTATTCGC 13140 

20 CAGCTTTCAC ACGAACTGTT GTTGGAGAAT ATGGTGTGGC ATGTTTTAAA TCCATAACAT 13200 

CTCCTCGAAC TTTTTCAGTG TCTAAATCAA TGATGACTAA TTCATCAACA ATGCTTTGGT 13260 

TCACTAATGA AAATGCGTAG CTTGAACCTA CTGCACCATT ACCTATTAAT ACAACTTTGT 13320 

TCCCTTTAAA TTTGTTCATT ACAAAAACTC CCTTATGATT AATTCACTAA CATACATGTA 13380

GCTTCAAATA TGTTAGTTTA ATGCTGCTTA TTGACGATAC AAAAGCAAAT AAACATCTCT 13440 

TTTATTTTCA ACGCATAACT TAAAAGGTCA TGTGTCATCC GCTTTTAAGT TTGTGATTTA 1350 0 

30 

TTTCACATAT AAAATGTAAC ATGCATTAAG TACTGGGTCA ATATTAAATT GTGATTTATT 13560 

TCACATTTTA TTTTAATTTT TACACCTTTT TAATTTGTAT mCGATTACAT CTTAGATGTC 13 620 

TTTAGTCTTC GTACTTCGCC AGTGATTATT TACACTTTCA CATTTTTATT ATCATGTTTA 13 680 

35 

CTTTTTTCTA GGAAAACAAC AATGTTTTTT GAATTAGTCA AATAAATGCG CTCAATCGTC 13740 

GGTGTGCAAA CAGACAATTG TACACAATGC TTATTGATAA GTATTTAAAA AATTAAAAAT 13 800 

40 GTCATACAAT TATCAAATTT GCCATTTTAT TTATATTTTC TCAAACCAAT TAATTGAATA 13860 

TCGAAATTTT TAGTAGAATA ATCAAAATAT ACAGATTAAA GGAGGAGTAT CATGCTTACA 13 920 

GAACAAGAGA AAGACATTAT CAAACAAACG GTGCCTTTAC TTAAAGAGAA AGGGACAGAA 13980 

45 ATTACGTCAA TCTTTTATCC AAAAATGTTT AAAGCGCATC CTGAACTTTT AAACATGTTT 14040 

AATCAAACGA ACCAAAAACG AGGCATGCAA TCTTCAGCAT TAGCACAAGC TGTAATGGCC 14100 

GCAGCGGTTA ATATCGATAA CTTAAGTGTT ATTAAACCAG TCATTATGCC AGTCGCATAT 14160 

50 

AAACACTGCG CACTACAAGT TTATGCTGAA CATTATCCAA TTGTGGGGAA AAATTTATTA 14 22 0 

AAAGCCATTC AAGACGTGAC AGGATTAGAA GAAAATGACC CTGTCATTCA AGCTTGGGCA 142 80 
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(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8779 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

GGTATTTTnG GAnGGGTACC TAAAGCAATT CCGGCAAAGG GTnAATCCAG GTACCGAAAT 6 0 

GGACTTCCCG TTATCGATAA TACCGACATA TATTGTGACA AGTAGATTTT ATGGACATTT 120

AGGCTTACTT TTACTTGTGA TAATTGCATG TATGTTTACT GGTATTTACC CaTCaATACA 180 

TATCATTCAA TTATTGATAT ATGTACCGTT TTGTTTTTTC TTAACTGCCt CGGTGACGTT 240 

20 ATTAACATCA ACACTCGGTG TGTTAGTTAG AGATACACAA ATGTTAATGC AAGCAATATT 30 0 

AAGAATATTA TTTTACTTTT CACCAATTTT GTGGCTACCA AAGAACCATG GTATCAGTGG 360 

TTTAATTCAT GAAATGATGA AATATAATCC AGTTTACTTT ATTGCTGAAT CATACCGTGC 420 

AGCAATTTTA TATCACGAAT GGTATTTCAT GGATCATTGG AAATTAATGT TATACAATTT 480

CGGTATTGTT GCCATTTTCT TTGCAATTGG TGCGTACTTA CACATGAAAT ATAGAGATCA 540 

ATTTGCAGAC TTCTTGTAAT ATATTTATAT GACGAAACCC CGCTAACCAT TAATAAATGG 600 

30 

AAGTGGGGTT CATTTTTGTT TATAATTTAA GTAAATAACA TATTAAGTTG GTGTATTATG 6 60 

AACGTTTTAA TAAAGAAATT TTATCATTTG GTAGTTCGAA TACTTTCTAA AATGATTACG 720 

CCTCAAGTGA TTGATAAACC GCATATCGTA TTTATGATGA CTTTTCCAGA AGATATTAAG 780 

35 

CCTATCATCA AAGCATTAAA TAATTCGTCG TATCAGAAAA CTGTTTTAAC AACACCAAAA 840 

CAAGCGCCTT ATTTATCTGA ACTTAGCGAC GATGTTGATG TGATAGAAAT GACTAATCGA 900 

40 ACATTGGTAA AACAAATTAA GGCTTTGAAA AGCGCGCAGA TGATTATTAT CGATAATTAT 960 

TACCTATTGC TAGGTGGATA TAATAAGACT TCTAATCAAC ACATTGTTCA AACGTGGCAT 102 0 

GCAAGTGGTG CATTAAAAAA CTTTGGCTTA ACAGATCATC AAGTCGATGT GTCTGACAAG 10 80 

45 GCAATGGTTC AGCAGTACCG TAAAGTTTAT CAAGCGACGG ATTTTTACTT AGTGGGTTGT 114 0 

GAACAAATGT CACAATGTTT TAAACAGTCT TTAGGTGCAA CAGAAGAGCA AATGCTGTAT 1200 

TTTGGGCTTC CGAGAATTAA TAAATATTAC ACAGCTGATA GAGCAACGGT TAAGGCAGAG 1260 

so 

TTAAAGGATA AATATGGAAT TACAAATAAG TTGGTATTAT ATGTACCAAC ATATAGAGAA 132 0 

^ATAAAGCAG ATAATAGGOC TATTGATAAA ^rT^AT^T""^ n-VAA^^^ i<~~z^n~?i~ 
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ATCGACACGT CTACATTAAT GCTAATGTCA GATATAATTA TTAGCGACTA TAGTTCGCTG 150 0 

CCAATAGAAG CTAGCTTGTT AGATATTCCA ACTATATTTT ATGTGTATGA TGAAGGAACA 1560 

TATGATCAGG TGAGAGGCCT GAATCAATTT TACAAAGCAA TACCGGATAG CTACAAAGTG 1620 

TATACTGAAG AAGATTTAAT AATGACGATA CAAGAAAAAG AACATCTATT AAGTCCGTTA 16 8 0 

TTTAAAGATT GGCATAAGTA TAATACTGAT AAAAGTTTAC ATCAGCTCAC AGAATATATA 174 0 

GATAAGATGG TGACAAAATG AGGTTTACGA TAATCATACC TACATGTAAT AATGAGGCAA 180 0 

CAATTCGACA ATTGTTAATA TCTATTGAGA GTAAAGAACA CTATAGAATC CTTTGTATTG 186 0 

ATGGTGGTTC TACTGATCAA ACAATTCCTA TGATTGAACG GTTACAAAGA GAACTCAAGC 192 0 

ATATTTCATT AATACAATTA CAAAATGCTT CGATAGCTAC GTGTATTAAT AAAGGTTTGA 1980 

TGGATATCAA AATGACAGAT CCACATGATA GTGACGCATT TATGGTCATA AAACCAACAT 204 0 

CAATCGTATT GCCAGGTAAA TTAGATAGGT TAACTGCTGC TTTCAAAAAT AATGATAATA 210 0 

TTGATATGGT AATAGGGCAG CGAGCTTACA ATTACCATGG TGAATGGAAA TTGAAAAGTG 216 0 

CTGATGAGTT TATTAAAGAC AATCGAATCG TTACATTAAC GGAACAACCA GATTTGTTAT 222 0 

25 CAATGATGTC TTTTGACGGA AAGTTATTCA GTGCTAAATT TGCTGAATTA CAGTGTGaCG 22 8 0 

AAACTTTAGC TAACaCATAC AATCACGCAA TACTTGTCAA GGCGATGCAA AAAGCTACGG 2 34 0 

ATATACATTT AGTTTCACAG ATGATTGTCG GAGATAACGA TATAGATACA CATGCTACAA 24 00 

GTAACGATGA AGATTTTAAT AGATATATCA CAGAAATTAT GAAAATAAGA CAACGAGTCA 24 6 0 

TGGAAATGTT ACTATTACCT GAACAAAGGC TATTATATAG TGATATGGTT GATCGTATTT 2 52 0 

TATTCAATAA TTCATTAAAA TATTATATGA ACGAACACCC AGCAGTAACG CACACGACAA 2 580 

TTCAACTCGT AAAAGACTAT ATTATGTCTA TGCAGCATTC TGATTATGTA TCGCAAAACA 264 0 

TGTTTGACAT TATAAATACA GTTGAATTTA TTGGTGAGAA TTGGGATAGA GAAATATACG 2700 

AATTGTGGCG ACAAACATTA ATTCAAGTGG GCATTAATAG GCCGACTTAT AAAAAATTCT 276 0 

TGATACAACT TAAAGGGAGA AAGTTTGCAC ATCGAACAAA ATCAATGTTA AAACGATAAC 2 82 0 

GTGTACATTG ATGACCATAA ACTGCAATCC TATGATGTGA CAATATGAGG AGGATAACTT 2880 

AATGAAACGT GTAATAACAT ATGGCACATA TGACTTACTT CACTATGGTC ATATCGAATT 2 94 0 

GCTTCGTCGT GCAAGAGAGA TGGGCGATTA TTTAATAGTA GCATTATCAA CAGATGAATT 3 0 00 

TAATCAAATT AAACATAAAA AATCTTATTA TGATTATGAA CAACGAAAAA TGATGCTTGa 3 0 60 

50 ATCAATACGC TATGTCGATT TAGTCATTCC AGAAAAGGGC TGGGGACAAA AAGAAGACGA 3120 

TGTCGAAAAA TTTGATGTAG ATGTTTTTGT TATGGGACAT GACTGGGAAG GTGAATTCGA 3180 
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TAAAATCAAA CAAGAATTAT ATGGTAAAGA TGCTAAATAA ATTATATAGA ACTATCGATA 3 3 00 

CTAAACGATA AATTAACTTA GGTTATTATA AAATAAATAT AAAACGGACA AGTTTCGCAG 33 60 

5 CTTTATAATG TGCAACTTGT CCGTTTTTAG TATGTTTTAT TTTCTTTTTC TAAATAAACG 34 2 0 

ATTGATTATC ATATGAACAA TAAGTGCTAA TCCAGCGACA AGGCATGTAC CACCAATGAT 34 8 0 

AGTGAATAAT GGATGTTCTT CCCACATACT TTTAGCAACA GTATTTGCCT TTTGAATAAT 3 54 0 

10 TGGCTGATGA ACTTCTACAG TTGGAGGTCC ATAATCTTTA TTAATAAATT CTCTTGGATA 3 600 

GTCCGCGTGT ACTTTACCAT CTTCGACTAC AAGTTTATAA TCTTTTTTAC TAAAATCACT 36 60 

TGGTAAAACA TCGTAAAGAT CATTTTCAAC ATAATATTTC TTACCATTTA TCCTTTGCTC 3 720 

15 

ACCTTTAGAC AATATTTTTA CATATTTATA CTGATCAAAT GAGCGTTCCA TTAATGCATT 3 780 

CCCCATCATA TTACGTTGCT TCTCGCCACC AAGGTTTTTA TAGTCTCCTG CACCCATGAT 3 84 0 

AACTTGATTA ATTCTAAATT TACCTCGTTT GGTAGTAATC GTATGGTTGT AATTTGCTGT 3 900 

20 

ATCACTTGAT CCAGTTTTTA AACCATCTGT ACCCGGCAAA CTCATTTTTG CACCTTCCAA 3960

TGAAAAGTTG AATGTGTAAT ACGTAACTGC ATGCGTTGTT GGTGCTAACT GCTTTGTAAA 4 02 0 

GTCTAATATT TTAGGTGTCT CTTTAATCAC GTGTAAATCT AAAATGGCAT AGTCTCTAGC 4080

AGTCGTTACA GTACGTTCTT GGTCTTTATA CTTTGTTGGT GCAAATGTAC GTAATCTTGA 414 0 

ATTTTCAGCA CCCGTTGGAT TGACGAAATG TGTATTTTTC ATTCCGATAG CTTTAGCTTT 4 2 00 

30 GTTATTCATT AAATCAACGA AATCGCTGGT GTTTTTTGAA ACCTTCTTAG CTAAAATTAA 4 260 

TGCCGCGGCA TTACTAGAAT TAGATACTGT AATTTGTAAT AGGTCTGCGA TTGTCCATAC 4 320 

TTGTCCAGGA TATAGTTTCG TATTACTCAA CTCAGGTAGT GTAGACATAA TATATTCTTT 4 3 80 

35 

GTTCGTCATT GTGACTGTGT CATCAAGTGA AAGCTGCCCC TTATTTACAG CTTCCAATGT 444 0 

TAAGTACATT GTCATTAATT TAGTCATAGA CGCTGGAtTC CACTTAGTAT CGATATTGTA 4500

TTGATACAGT AATTGTCCAG TTTGACTTAC ATTAACAGCA CTCGTCGGTT CGTATGCAGC 4 56 0 

40 

CGACAAACCT GCATAACCAT ATTGATTTGC TGCTTGTACA GGGGTTACGT CACTGTTAGT 4 62 0 

AGCTTGTGCA TATGGTGTCA TAATACTTAA TGTTAAACAT AAAATGATGA TAATAGATAT 4 680 

TAAATTTTTC ATAAAGCGTT AATCTTCCCT TTTCCAATTC TTAAATATTC CCTAAAAGCA 4740

ATGGTTATTC CTACTTACGG AAATCATTGC TAATTCACTT CACCTTAATT AAATTGTTGA 4 80 0 

AAATAAAGTT TTCTGCAGTT AATTTGAAAA ATAATGCAAA TATATTACGT GTGTAGCTAA 4 86 0 

50 AGGTGTTATA ATGTTTGTAC GAAGAGCAAA CTTACTCAAA AGCGATTAAT TTTCATGTTT 4 92 0 

TAATATAAAG ACTTTGAGAA GTTATTACAA AAAATGCAAT AGAAATATTC TATCATATAA 4980
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AAGTATATGA TAGAAATGCA TGTATCTATC TAAATGAATT AACTATAAAT TTCAAACAGA 5100 

AGAGGTAAAA CTATGAAACG AGAAAATCCA TTGTTTTTCT TATTTAAAAA ACTATCATGG 516 0 

CCAGTGGGTC TTATCGTTGC AGCTATCACT ATTTCATCAC TAGGGAGCTT AAGTGGACTA 522 0 

TTAGTGCCAC TGTTTACTGG ACGAATTGTA GATAAATTTT CCgTGAGCCA TATCAATTGG 52 8 0 

AATCtAATCG CATTATTTGG TGGTATCTTT GTCATCAATG CTTTATTAAG CGGATTAGGT 534 0 

TTATATTTAT TAAGTAAAAT TGGTGAAAAG ATTATTTATG CGATACGCTC AGTTTTATGG 54 0 0 

GAGCATATCA TACAATTAAA AATGCCATTC TTTGACAAAA ATGAAAGTGG TCAATTAATG 54 6 0 

AGTCGATTAA CTGACGATAC GAAAGTGATA AATGAATTTA TTTCACAAAA GCTACCTmAC 552 0 

TTATTACCAT CAATCGTTAC ATtAGTTGGG TCACTAATCA TGTTATTTAT TTTAGATTGG 558 0 

AAAATGACAT TATTAACATT TATAACGATA CCGATATTCG TTTTaATTAT GATTCCTCTA 564 0 

GGTCGTATTA TGCAAAAGAT ATCGACAAGT ACACAATCTG AAATTGCAAA CTTCAGTGGT 57 0 0 

TTGTTAGGGC GTGTCCTAAC TGAAATGCGT CTTGTTAAAA TATCAAATAC AGAGCGTCTT 576 0 

GAATTAGATA ATGCACATAA AAATTTGAAT GAAATATATA AATTAGGTTT AAAACAGGCT 582 0 

AAAATTGCGG CAGTTGTACA ACCAATTTCA GGTATAGTTA TGTTGCTAAC AATTGCAATT 58 8 0 

ATTTTAGGTT TTGGTGCATT AGAAATTGCG ACTGGTGCAA TCACTGCAGG TACATTAATT 594 0 

GCAATGATAT TTTATGTTAT TCAGTTATCT ATGCCTTTAA TCAATCTTTC CACGTTAGTT 6000 

ACAGATTATA AAAAGGCAGT CGGTGCAAGT AGTAGAATAT ACGAAATCAT GCAAGAACCT 6 06 0 

ATTGAACCGA CAGAAGCTCT TGAAGATTCT GAAAATGTAT TAATTGATGA CGGTGTATTG 612 0 

TCATTTGAAC ATGTAGACTT TAAATATGAT GTGAAGAAAA TATTAGATGA TGTGTCGTTC 618 0 

CAAATCCCAC AAGGTCAAGT GAGTGCTTTT GTAGGCCCTT CTGGGTCTGG TAAAAGTACG 624 0 

ATATTTAATC TGATAGAACG TATGTATGAA ATTGAGTCAG GTGATATTAA ATATGGCCTT 630 0 

GAAAGTGTCT ATGATATCCC GTTATCTAAG TGGCGACGCA AAATTGGATA TGTTATGCAA 636 0 

TCAAATTCGA TGATGAGTGG TACAATTAGA GACAATATTT TATACGGAAT TAATCGTCAT 642 0 

GTTTCAGATG AAGAACTTAT TAATTATGCT AAATTAGCGA ACTGTCATGA TTTTATCATG 64 8 0 

CAATTTGATG AAGGATATGA CACGCTTGTA GGTGAACGAG GATTGAAACT GTCTGGCGGA 654 0 

CAACGTCAAC GTATTGATAT TGCTAGAAGT TTTGTTAAAA ATCCTGATAT TTTGTTACTT 6 6 00 

GATGAAGCAA CAGCTAATCT CGATAGTGAA AGTGAATTGA AAATTCAAGA AGCTTTAGAA 6660 

ACATTGATGG AAGGTAGAAC AACGATTGTC ATTGCGCATC GTTTGTCTAC AATTAAAAAA 6 72 0 

GCCGGTCAAA TTATATTCTT AGACAAAGGA CAGGTAACAG GTAAAGGTAC GCATTCAGAA 67 8 0 
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TTTTATATAT ATAAGTAAGC TTGGAGCAAA TACACATATA CCATCGAGGA AATTAAAGTG 6 90 0 

TGGCACATTG ATGGATATAG ATGTTAATAA ATTGCTTCAA GCTTTTGTCT ATTTTAAATC 6 96 0 

ATTTGAGAAG TTACGACATA ATAATTCTTA AATTAATGAA ATCGATATTT TAAGAAAAAA 7 02 0 

ATGCTCATGG TATAATACAA GTTATAAGCA AACATACATA TATTAAATAC TGTAGCCACG 7 08 0 

AGTCATAATT CTTCATATTT TACATAGCAA T7TAACTGAT TTTAGAGTCC ACGGTACAGA 7140

AGTTTGATAT TTCAATGTTT CTAAATTTTT AAAAAATTAA ATCATAGGTG GGTGCCAAAT 7200 

GTTTTTATTA ATCAACATTA TTGGTCTAAT TGTATTTCTT GGTATTGCGG TATTATTTTC 7260

AAGAGATCGC AAAAATATCC AATGGCAATC AATTGGGATC TTAGTTGTTT TAAACCTGTT 732 0 

TTTAGCATGG TTCTTTATTT ATTTTGATTG GGGTCAAAAA GCAGTAAGAG GAGCAGCCAA 738 0 

TGGTATCGCT TGGGTAGTTC AGTCAGCGCA TGCTGGTACA GGTTTTGCAT TTGCAAGTTT 7440 

GACAAATGTT AAAATGATGG ATATGGCTGT TGCAGCCTTA TTCCCAATAT TATTAATAGT 7500 

GCCATTATTT GATATCTTAA TGTACTTTAA TATTTTACCG AAAATTATTG GAGGTATTGG 7560 

TTGGTTACTA GCTAAAGTAA CAAGACAACC TAAATTCGAG TCATTCTTTG GGATAGAAAT 7620 

GATGTTCTTA GGAAATACTG AAGCATTAGC CGTATCAAGT GAGCAACTAA AACGTATGAA 7680 

TGAAATGCGT GTATTAACAA TCGCAATGAT GTCAATGAGC TCTGTATCGG GAGCTATTGT 774 0 

AGGTGCGTAT GTACAAATGG TACCAGGAGA ACTGGTACTA ACGGCAATTC CACTAAATAT 7 800 

CGTTAACGCG ATTATTGTGT CATGCTTGTT GAATCCAGTA AGTGTTGAAG AGAAAGAAGA 7 860 

TATTATTTAC AGTCTTAAAA ACAATGAAGT TGAACGTCAA CCATTCTTCT CATTCCTTGG 7 920 

AGATTCTGTA TTAGCAGCAG GTAAATTAGT ATTAATCATC ATCGCATTTG TTATTAGTTT 7 9 80 

TGTAGCGTTA GCTGATCTAT TTGATCGTTT TATCAATTTG ATTACAGGAT TGATAGCAGG 8040 

ATGOATAGGC ATAAAAGGTA GTTTCGGTTT AAACCAAATT TTAGGTGTGT TTATGTATCC 8100 

ATTTGCGCTA TTACTCGGTT TACCTTATGA TGAAGCGTGG TTGGTAGCAC AACAAATGGC 8160 

TAAGAAAATT GTTACAAATG AATTTGTTGT TATGGGTGAA ATTTCTAAAG ATATTGCATC 82 20 

TTATACACCA CACCATCGTG CGGTTATTAC AACATTCTTA ATTTCATTTG CAAACTTCTC 8280

AACGATTGGT ATGATTATCG GTACATTGAA AGGCATTGTT GATAAAAAGA CATCAGACTT 83 4 0 

TGTATCTAAA TATGTACCTA TGATGCTATT ATCAGGTATC CTAGTTTCAT TATTAACAGC 84 0 0 

AGCTTTCGTT GGTTTATTTG CATGGTAATA TGTCGAAGAG TGACTATGAT AATACATTTT 84 6 0 

AACTAATAAA TATGTCCAGG CATGTCGTCT ATTGATATAG GTGAGATGCT TGGACTTTTT 852 0 

TATTATTGAT ATAAAGGTAT nTAAATATTT TTAAAGTTAC CGAAATTGAA GCATTATAAA 8580
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GACAGTAAGG ACTAGGTACA GTCATAGTAC TTCGAGCAAA ATTTGTTTTG TTATTATAAA 8 7 00 

CAACACAAAG GAGATAACTT CTCTAnTGAA GAAGTTAAAA ACATTATAGC AGACAATGAA 87 6 0 

5 ATGAAAGTAA ATTAAAAAT 8 77 9 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31096 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:





GTTGCAGTAG TCAAAGAATT AAACAAGGTG AAGGcGTGTA GCTTGCACAC CCGAAAATGT 60

GCGTAAGTTA aCGGATGCAG GACATAAAGT AATTGTTGAA AAAAATGCTG GCATTGGTTC 120

AGGATTTTCT AACGATATGT ATGAAAAAGA AGGCGCTAAG ATCGTAACTC ACGAACAAGC 180

ATGGGAAGCT GATCTTGTTA TCAAAGTAAA AGAACCTCAT GAAAGCGAAT ATCAATATTT 240

CAAAAAGAAT CAAATTATCT GGGGATTTTT ACATCTAGCA TCTTCAAAAG AAATAGTAGA 300

AAAAATGCAA GAAGTTGGTG TAACTGCGAT TAGTGGTGAA ACCATTATAA AAAATGGAAA 360

AGCAGAATTA TTAGCGCCAA TGAGTGCTAT AGCAGGTCAA CGCTCAGCAA TTATGGGAGC 420

TTACTACTCT GAAGCACAAC ATGGTGGTCA AGGTACTTTA GTGACTGGTG TACATGAAAA 480

TGTGGATATA CCTGGTAGTA CATATGTGAT TTTCGGTGGT GGAGTAGCAG CAACAAATGC 540

AGCAAATGTT GCCTTGGGAC TAAATGCTAA AGTAATCATT ATCGAGTTAA ACGATGACCG 600

CATTAAATAT CTTGAAGATA TGTATGCAGA AAAAGATGTC ACAGTAGTCA AATCAACACC 660

AGA^AATTTA GCAGAACAAA TTAAGAAAGC AGATGTATTT ATTTCTACAA TTTTAATTTC 720

AGGTGCGAAA CCGCCAAAAT TGGTTACTCG TGAGATGGTT AAATCAATGA AAAAAGGTTC 780

AGTATTAATC GATATAGCTA TTGACCAAGG TGGAACTATT GAAACAATTA GACCAACTAC 840

AATTTCTGAT CCAGTGTATG AAGAAGAAGG TGTGATTCAT TATGGTGTAC CAAATCAACC 900

AGGAGCAGTC CCAAGAACTT CAACAATGGC ATTAGCACAA GGAAATATTG ATTATATATT 960

AGAAATTTGT GACAAAGGCT TAGAACAAGC AATTAAAGAT AATGAAGCCT TAAGTACTGG 1020

TGTAAACATT TACCAAGGAC AAGTGACAAA TCAAGGATTA GCTTCATCAC ATGACCTAGA 1080

TTATAAAGAA ATATTAAATG TTATCGAATA GATAGTAATT TAAATGAAAT TGAGTGAAAT 1140

GAATATTTTA AATATAGCAT TATAGTTTGG ACTAAAAATT TACAAAACGG AAGGATGTAA 1200

TCGAAGAAGC TAAAGCAAGC ATTAAACCAT TTATTCGTCG AACACCTCTA ATTAAATCAA 1320

TGTATTTAAG CCAAAGTATA ACTAAAGGGA ATGTATTTCT AAAATTAGAA AATATGCAAT 1380

TCACAGGATC TTTTAAATTT AGAGGCGCTA gCAATnAAAA TTAATCACTT AACAGATGAA 1440

CAAAAAGAAA AAGGCATTAT CGCAGCATCT GCTGGGgAAC CATGCACAAG GTGTTGCTTT 1500 

AACAGCTAAA TTATTACGCA TTGATGCAAC GATTGTAATG CCTGAAACAG CACCACAAGC 1560

GAAACAACAA GCAACAAAAG GCTATGGGGC AAAGGTTATT TTAAAAGGTA AAAACTTTAA 1620

CGAAACTAGA CTTTATATGG AAGAATTAGC GAAAGAAAAT GGCATGACAA TCGTTCATCC 1680

ATATGACGAT AAGTTTGTAA TGGCAGGCCA AGGAACAATT GGTTTAGAAA TTTTAGATGA 1740

TATTTGGAAT GTGAATACAG TCATCGTACC AGTTGGCGGT GGAGGATTAA TTGCAGGTAT 1800

TGCCACCGCA TTAAAATCAT TTAACCCTTC AATTCATATT ATCGGTGTTC AATCTGAGAA 1860

TGTTCATGGT ATGGCTGAGT CTTTCTATAA GAGAGATTTA ACTGAACATC GAGTGGATAG 1920 

CACAATAGCA GATGGTTGTG ATGTAAAAGT TCCTGGTGAA CAAACATATG AAGTAGTTAA 1980

ACATTTAGTA GATGAATTTA TTCTTGTTAC TGAAGAAGAA ATTGAACATG CTATGAAAGA 2040

TTTAATGCAG CGTGCCAAAA TTATTACTGA AGGTGCAGGC GCATTACCAA CAGCTGCAAT 2100 

TTTAAGTGGA AAAATAAACA ATAAATGGCT TGAAGATAAA AATGTTGTTG CATTAGTTTC 2160 

AGGCGGGAAT GTTGACTTAA CTAGAGTTTC AGGTGTCATT GAACATGGAC TGAATATTGC 2220 

AGATACAAGC AAGGGTGTGG TAGGTTAAAA CATTTAATCT TAAAAATGAG GTGTAATTAT 2280

GTCAAATGGT AAAGAATTAC AAAAAAATAT AGGTTTCTTC TCAGCGTTTG CTATTGTTAT 2340

GGGGACAGTT ATTGGTTCAG GAGTATTCTT TAAAATATCA AACGTAACAG AAGTAACAGG 2400

AACAGCAGGA ATGGCCTTGT TTGTATGGTT CCTAGGCGGC ATCATTACCA TTTGTGCGGG 2460

GTTAACAGCA GCAGAACTTG CTGCTGCAAT CCCTGAAACA GGTGGCTTAA CGAAGTATAT 2520 

AGAATATACA TACGGTGATT TCTGGGGCTT CCTATCAGGT TGGGCGCAAT CATTTATTTA 2580 

TTTTCCAGCT AACGTAGCAG CATTGTCTAT CGTATTTGCG ACACAGCTAA TTAATTTATT 2640

CCATTTATCT ATAGGTTCGT TAATACCAAT AGCAATCGCA TCTGCGTTAT CTATTGTGTT 2700

GATAAATTTC CTAGGTTCAA AAGCAGGCGG AATTTTACAA TCAGTTACTT TAGTAATTAA 2760

ACTGATTCCA ATCATCGTTA TTGTAATTTT TGGTATTTTT CAATCTGGAG ATATCACTTT 2820

ITCATTAATT CCAACTACAG GTAATTCaGG AAATGGCTTC TTTACAGCAA TTGGTAGTGG 2880

ITTATTAGCA ACTATGTTTG CATATGATGG TTGGATTCAT GTAGGAAATG TTGCGGGGGA 2940

ACTTAAAAAT CCTAAACGCG ATTTACCTTT AGCGATTTCA GTTGGTATCG GTTGTATTAT 3000

TGGTAATTTA AATGCAGCTT CAGATACATC AAAAATATTA TTTGGTGAAA ATGGCGGTAA 3120

GATTATTACA ATCGGTATAT TAATTTCTGT TTATGGTACG ATCAATGGCT ATACTATGAC 3180

TGGTATGCGC GTACCATATG CAATGGCTGA AAGAAAATTA TTGCCATTTA GCCATTTATT 3240

CGCAAAATTA ACAAAATCTG GCGCACCATG GTTTGGCGCA ATTATACAAC TTATAATCGC 3300

TATCATCATG ATGTCAATGG GAGCATTTGA TACAATTACA AATATGTTAA TCTTTGTTAT 3360

TTGGTTGTTC TATTGTATGT CATTTGTTGC GGTAATAATT TTAAGAAAAC GTGAACCAAA 3420

TATGGAACGA CCATATAAAG TACCGTTATA TCCGATCATA CCTTTAATTG CTATTTTGGC 3480

AGGATCATTT GTATTAATTA ATACACTGTT TACACAATTT ATATTAGCAA TCATTGGAAT 3540

TCTAATAACA GCACTTGGTA TACCAGTTTA TTACTATAAA AAGAAACAAA AAGCAGCATA 3600

AGGTAAGATA ACTAGCATTG AGAATAAATG GATGGACTAC TAATAAATTT AAAGTTTTAC 3660

ACATTAAAAT CAAAAACCAT TCAATTATTC TATGGAACAG ACAAATTTCT GTTATGGAAT 3720

TTGTCTGTTT TTCAAAAGTA TAGGGAGGCA AATAGAGATG GAAAAGCCGT CAAGAGAGGC 3780

ATTTGAAGGC AATAATAAGT TGTTAATAGG AATTGTTCTA AGTGTAATAA CGTTTTGGCT 3840

ATTTGCACAA TCATTGGTTA ATGTTGTACC AATACTTGAA GATAGTTTCA ATACAGATAT 3900

TGGAACGGTT AATATCGCCG TTAGTATAAC TGCTTTATTT TCAGGAATGT TTGTAGTAGG 3960

AGCAGGTGGT CTTGCTGATA AATATGGCAG AATTAAACTC ACGAACATTG GTATTATCTT 4020

AAATATATTA GGTTCATTAT TAATCATTAT TTCAAATATT CCTTTATTAC TTATTATAGG 4080

AAGATTAATT CAAGGACTTT CAGCAGCATG TATTATGCCT GCAACTTTGT CTATTATTAA 4140

GTCATATTAC ATTGGGAAAG ATAGACAACG CGCTTTAAGT TATTGGTCAA TTGGCTCATG 4200

GGGCGGCTCT GGTGTTTGTT CATTTTTTGG AGGTGCAGTT GCAACGCTTT TAGGTTGGCG 4260

TTGGATTTTC ATCCTATCAA TTATAATTTC ATTAATTGCA CTGTTTCTTA TTAAAGGCAC 4320

ACCTGAAACT AAATCTAAAT CGATTTCTCT AAATAAATTT GACATTAAAG GTCTGGTTCT 4380

TTTAGTCATT ATGCTCCTCA GTTTAAATAT TTTAATTACT AAAGGATCAG AATTAGGTGT 4440

AACCTCACTT CTTTTTATTA CTTTATTAGC TATTGCAATT GGATCTTTTA GTTTATTTAT 4500

AGTTCTTGAA AAGCGTGCTA CAAATCCTTT AATCGATTTT AAATTATTTA AAAATAAAGC 4560

TTACACAGGT GCAACAGCTT CAAACTTTTT GTTAAATGGT GTTGCAGGAA CATTAATAGT 4620

AGCCAACACA TTTGTTCAAA GAGGTTTAGG ATATTCTTCA TTGCAAGCAG GAAGTTTATC 4680

AATCACTTAT TTAGTAATGG TACTAATTAT GATTCGTGTT GGTGAAAAGT TACTTCAAAC 4740

ACTCGGATGC AAGAAACCAA TGTTAATTGG AACAGGAGTT CTTATTGTCG GAGAATGTCT 4800

ATTCTTTGGT TTAGGACTAG GGATATATGC TACACCATCA ACAGATACAG CAATTGCAAA 4920

TGCACCGTTA GAAAAAGTAG GCGTTGCTGC AGGTATCTAT AAAATGGCTT CTGCATTAGG 4980

TGGAGCATTT GGCGTCGCAT TGAGTGGTGC AGTATATGCA ATCGTATCAA ATATGaCAAA 5040

CATTTATACA GGTGcAATGa TTGnCATTAT GGTTaAATGC AGGTATGGGa ATATTATCaT 5100

TCGTTATCAT TTTGtTACTT GTGcCTAAAC mAAACGACAC TCAATTATGA TAATTGAGAA 5160

TTAAATTGAA ATCATACAAG TCGCTACAAT ATTAAACAAA AATATAAACC GATTCTTATG 5220

TGTCATTATT TTAAATGAAC ATAGGGATTG GTTTTTTATT ACTCTTTTAC GCTACTTTAT 5280 

TTATAATTAT TATAAATTGT CACAAATTCA ATTTACCTTA CAATATATTT TGTGTTATTA 5340

TATTCTGGAG CATAAATAAA TTGTTCAACA CATAGTTGTA ATGTGTTTCA ATACTTTTTG 5400

GATAGATTGC GAAATTGTAT TGAATCGTCA TCGTTTTAAA TTTTTAAATG AGAATGGAAT 5460

GAGCATTACA ATACACAAGC AATCAAAAGT AAATACATTC ACAACACAAC AGAGACATAA 5520 

CAACAAGATA AGGAGTGAAC AATAGCTGTG AATTATCGTG ATAAAATTCA AAAGTTTAGT 5580 

ATTCGTAAAT ATACAGTTGG TACATTTTCA ACTGTCATTG CGACATTGGT ATTTTTAGGA 5640

TTCAATACAT CACAAGCACA TGCTGCTGAA ACAAATCAAC CAGCAAGCGT GGTTAAACAG 5700

AAACAACAAA GTAATAATGA ACAGACTGAG AATCGAGAAT CTCAAGTACA AAATTCTCAA 5760 

AATTCACAAA ATGGTCAATC ATTATCTGCT ACTCATGAAA ATGAGCAACC AAATATTAGT 5820 

CAAGCTAATT TAGTAGATCA AAAAGTAGCG CAATCATCTA CTACTAATGA TGAACAACCA 5880 

GCATCTCAAA ATGTAAATAC AAAGAAAGAT TCGGCAACGG CTGCGACAAC ACAACCAGAT 5940

AAAGAACAAA GTAAGCATAA ACAAAACGAA AGTCAATCTG CTAATAAAAA TGGAAACGAC 6000 

AATAGAGCGG CTCATGTAGA AAATCATGAA GCAAATGTAG TAACAGCTTC AGATTCATCT 6060

GATAATGGTA ACGTACAACA TGACCGAAAT GAATTACAAG CGTTTTTTGA TGCAAATTAT 6120 

CATGATTATC GCTTTATTGA CCGTGAAAAT GCAGATTCTG GCACATTTAA CTATGTAAAA 6180

GGCATTTTTG ATAAGATTAA TACGTTATTA GGCAGTAATG ATCCAATAAA CAATAAAGAC 6240

TTGCAACTTG CATACAAAGA ATTGGAACAA GCTGTTGCTT TAATTCGTAC AATGCCTCAA 6300

CGTCAACAGA CTAGCCGACG TTCAAATAGA ATTCAAACGC GTTCGGTTGA GTCAAGAGCT 6360

GCAGAGCCTA GATCAGTATC AGACTATCAA AATGCAAATT CATCATATTA TGTTGAAAAT 6420

GCTAATGATG GTTCGGGCTA TCCTGTTGGT ACATATATCa ATGCTTCTAG TAAAGGGGCG 6480

CCATATAATT TACCAACTAC ACCATGGAAT ACATTGAAGG CCTCTGACTC AAAGGAAATT 6540

GCTCTTA7GA CAGCGAAACA AACTGGAGAC GGGTACCAAT GGGTTATTAA GTTTAATAAA 6600

GTAGGAAGAA CTGACTTTGT AACAGTTAAT TCAGATGGAA CAAATGTACA ATGGAGTCAT 6720

GGAGCAGGAG CAGGTGCAAA TAAACCACTT CAACAAATGT GGGAATATGG AGTAAATGAT 6780

CCTCATCGTT CACATGACTT TAAAATAAGA AATAGAAGTG GCCAAGTAAT ATATGACTGG 6840

CCAACTGTCC ATATTTATTC TTTAGAAGAT TTATCTAGAG CGAGTGATTA TTTTAGTGAA 6900

GCTGGAGCGA CACCTGCTAC TAAAGCTTTT GGTAGACAAA ATTTTGAATA TATTAATGGT 6960

CAAAAACCTG CTGAATCACC GGGTGTTCCT AAAGTTTATA CTTTCATCGG TCAAGGTGAT 7020

GCAAGTTATA CAATTTCATT TAAAACACAA GGTCCAACTG TTAATAAATT GTACTATGCA 7080

GCAGGTGGGC GTGCTTTAGA GTACAATCAA TTATTTATGT ACAGTCAACT ATACGTCGAA 7140

TCAACGCAAG ACCATCAACA ACGTCTTAAT GGTTTAAGAC AAGTGGTTAA TCGTACATAT 7200

CGCATAGGTA CAACTAAACG TGTAGAAGTG AGTCAAGGAA ATGTACAAAC GAAAAAGGTA 7260

TTAGAAAGTA CAAACCTAAA TATAGATGAT TTTGTTGATG ATCCTTTAAG TTATGTTAAG 7320

ACGCCGAGTA ATAAAGTGTT AGGATTTTAT TCGAATAATG CAAATACTAA TGCTTTTAGA 7380

CCGGGTGGAG CCCAACAATT AAATGAATAT CAATTAAGTC AATTATTTAC TGATCAAAAA 7440

TTACAAGAAG CAGCAAGAAC TAGAAACCCA ATAAGATTAA TGATTGGTTT CGACTATCCT 7500

GATGCTTATG GTAATAGTGA AcTTTAGTTC CTGTTAACTT AACGGTATTA CCTGAAATCC 7560

AACATAATAt TaAATTCTTT AAAAATGACG ATACTCAAAA TATTGCTGAA AAACCATTTT 7620

CAAAACAAGC TGGGCATCCA GTTTTCTATG TATATGCAGG TAACCAAGGG AATGCTTCCG 7680

TGAATTTAGG TGGTAGCGTA ACATCTATTC AACCATTACG TATTAATTTA ACAAGTAATG 7740

AGAATTTTAC AGATAAAGAT TGGCAAATTA CAGGTATTCC GCGTACATTA CACATTGAAA 7800

ACTCGACAAA TAGACCTAAT AATGCCAGAG AACGCAATAT TGAACTTGTT GGTAACTTAT 7860

TACC£GGGGA TTACTTTGGA ACGATACGTT TTGGACGTAA AGAACAATTA TTCGAAATTC 7920

GTGTTAAACC ACATACACCA ACAATTACAA CGACAGCTGA GCAATTAAGA GGTACAGCAT 7980

TACAAAAAGT GCCTGTTAAT ATTTCGGGAA TACCGTTGGA TCCATCGGCA TTGGTTTATT 8040

TAGTTGCACC AACAAATCAA ACTACGAATG GTGGTAGTGA GGCAGATCAA ATACCATCTG 8100

GTTATACGAT ACTTGCGACT GGTACACCTG ATGGGGTGCA TAATACAATT ACTATACGAC 8160

CGCAAGATTA TGTTGTATTC ATACCACCTG TAGGTAAACA AATTAGAGCA GTAGTTTATT 8220

ATAATAAAGT AGTTGCATCT AATATGAGTA ATGCTGTTAC TATTTTGCCA GATGACATTC 8280

CACCAACAAT CAATAATCCT GTTGGAATAA ATGCCAAATA CTATCGAGGC GACGAAkCAA 8340




CTTTACAATG GGTGTCTCTG ATAGACATTC TGGTATAAAA AATACAACTA TTACGACATT 8400

TACAGGTAGA GTGAGTATGA ATCAGGCATT TAACAGTGAT ATTACATTTA AAGTGTCAGC 8520

GACAGaCAAT GTCAATAATA CGACAAATGA TAGTCAATCT AAACATGTTT CAATTCATGT 8580

AGGTAAAATT AGTGAAGATG CTCATCCGAT TGTATTAGGA AATACTGAGA AAGTTGTAGT 8640

AGTCAATCCG ACTGCTGTAT CTAATGATGA AAAGCAAAGC ATAATTACTG CCTTTATGAA 8700

TAAAAACCAA AATATAAGAG GATATTTAGC ATCAACTGAT CCAGTAACTG TCGATAATAA 8760

TGGTAATGTC ACATTACATT ACCGTGATGG CTCATCGACA ACGCTTGATG CTACAAATGT 8820

GATGACATAC GAACCAGTTG TGAAACCTGA ATACCAAACT GTCAATGCTG CTAAAACAGC 8880

AACGGTAACG ATTGCTAAAG GACAATCATT TAGTATTGGT GATATTAAAC AATATTTTAC 8940

TTTAAGTAAT GGACAACCTA TTCCAAGTGG CACATTTACA AATATTACAT CTGATAGAAC 9000

TATTCCAACT GCACAAGAAG TTAGTCAAAT GAACGCAGGC ACGCAGTTAT ACCATATAAC 9060

TGCTACAAAT GCGTATCATA AAGATAGTGA AGACTTCTAT ATTAGTTTGA AAATCATCGA 9120

TGTGAAACAA CCAGAAGGCG ATCAACGTGT ATATCGTACA TCAACATATG ATTTAACTAC 9180

TGATGAAATC TCAAAAGTAA AACAAGCATT TATTAATGCA AATAGAGATG TAATTACGCT 9240

TGCCGAAGGT GATATTTCAG TTACAAATAC ACCTAATGGT GCTAATGTAA GTACTATTAC 9300

AGTAAATATT AATAAAGGTC GATTAACGAA ATCATTCGCG TCAAACCTAG CTAATATGAA 9360

TTTCTTGCGT TGGGTTAATT TCCCACAAGA TTATACAGTG ACATGGACGA ATGCAAAAAT 9420

TGCAAACAGA CCAACAGATG GTGGTTTATC ATGGTCTGAT GACCATAAAT CTTTAATTTA 9480

TCGTTATGAT GCTACATTAG GTACTCAAAT TACGACGAAT GATATTTTAA CAATGTTAAA 9540

AGCAACAACT ACAGTGCCTG GATTGCGAAA TAACATTACT GGTAATGAAA AATCACAAGC 9600

AGAAGCTGGC GGAAGACCTA ACTTTAGAAC GACTGGTTAT TCACAATCAA ATGCGACAAC 9660

TGATGGTCAA CGTCAATTTA CGTTGAATGG TCAAGTGATT CAAGTGTTAG ACATCATCAA 9720

CCCTTCAAAC GGTTATGGTG GGCAACCTGT TACAAATTCA AATACTCGTG CAAACCATAG 9780

TAACTCAACT GTTGTTAACG TAAACGAACC GGCAGCTAAT GGTGcTGGCG CATTTACAAT 9840

TGACCACGTT GTAAAAAGTA ATTCTACACA TAATGCAAGT GATGCAGTTT ATAAAGCACA 9900

GTTATACTTA ACGCCATATG GTCCAAAACA ATATGTTGAA CATTTAAATC AAAATACAGG 9960

AAATACTACT GACGCTATTA ACATTTATTT TGTACCAAGT GACTTAGTGA ATCCAACAAT 10020

TTCAGTAGGT AATTACACTA ATCATCAAGT GTTCTCAGGT GAAACATTTA CAAATACTAT 10080

TACAGCGAAT GATAACTTTG GTGTGCAATC TGTAACTGTA CCAAATACAT CACAAATTAC 10140

AGGTACTGTT GATAATAACC ATCAACATGT TTCTGCAACG GCACCAAATG TGACATCAGC 10200

GTTCAATGTA


ACAGTGAAAC 


CTTTGCGTGA 

\- x x xuuy X 




vjx lUfjIAL I I 


LA 1 uHMLLAjL 


10320 




TGCTAATCCT 


GTGAGAATTG 


CCAATATTTP 




AL-ALj I A 1 UAL. 


AAVj L. 1 LiA 1 l_-A 


10380 


5 


AACGACAATT 


ATTAATTCGT 


T AACGTTT A P 


TGAAAPAfVTA 


C UvV\ x Avj AA 


PTT & TP2P A a p 


10440 




AGCAAGTGCG 


AATGAAATCA 


CTAGTAAAAP 


A GTTAOTA A T 


\J 1 L-ALj I 1 A 


ptpp a a a t a a 

L. 1 oLi AAA x AA 


n r\ c n r\ 
1 Ob 0 0 




TGCCAATGTg 


CACAGTAACT 


GTTACTTATC 


AAGATGGAA^ 


AAP ATP A A P A 


ptp. a p to t a p 


1 n c c n 
1 U b fa U 


10 


CTGTAAAGCA 


TGTCATTCCA 


GAAATCGTTG 


CAC ATT CO P A 


TTAPAPTYZTA 


p a a rzp. pp a a p 


lUbzU 




ACTTCCCAGC 


AGGTAATGGT 


TCTAGTGCAT 


PAG ATT A PTT 


Xxirvj 1 1 Al l« 1 


AA 1 X Ao lb 


1 neon 


15 


ACATTGCAGA 


TGCAACTATT 

*** * ** X ** X X 


ACATGGGTAA 


GT£V5APAAf2P 


CICO A a fiTl R A 


(jAxAAIACAC 


1 074 0 


GTATTGGTGA 


AGATATAACT 


GTAACTGCAC 


ATATfTTAAT 

X rt i L 1 XAA X 


TPaTnnpp a a 


AL-AALvjL-LLtA 






TTACGAAAAC 


AGCAACATAT 


AAAGTAGTAA 

****** VJ X **\»J a 


GAAPTGTAPP 


GazvapaTpyrp 


tttt a a a r* a p 

1 1 ILjAAAL-Ao 


J. U b b (J 


20 


CCAGAGGTGT 


TTTATACCCA 


GGTGTTTCAG 


ATATGTATGA 


X uLunnnLnn 


1 A 1 L7 1 1 AAvjL. 


1 Pi Q O Pi 

1 u y z u 




CAGTAAATAA 


TTCTTGGTCG 


ACAAATGCGC 


AACATATGAA 


TTTCCAATTT 




1 Pi Q O Pi 

1U y oU 




ATGGTCCTAA 


CAAAGATGTT 


GTAGG CATAT 


CTACTCGTCT 


TATTAGAGTG 


AL-A1 AHjAI A 


1 1 U4 U 


25 


ATAGACAAAC 


AGAAGATTTA 


ACTATTTTAT 

**■ »^ X J» J. X X X *» x 


CTAAAGTTAA ACCTGACCCA 


V^L-iAoAAl lu 


1 1 1 Art 

1 1 1 u u 




ACGCAAACTC 


TGTGACATAT 


AAAG CAGGT C 


TTACAAACCA 


AGAAATTAAA 


vjX IAAIAALCj 


11160 




TATTAAATAA 


CTCGTCAGTA 


AAATTATTTA 

****** X X *» X X J. *t> 


AAGCAGATAA 


TACACCATTA 


a aTrTrara a 
AA X U 1 LALftA 


i i Z Z U 


30 


ATATTACTCA 


TGGTAGCGGT 


TTTAGTTCGG 

■*- X X li\J X X VWw 


TTGTGACAGT 


AAGTGACGCG 




1 1 O O Pi 

JL 1 Z O U 




GCGGAATTAA 


AGCAAAATCT 

*T*.^J ^ <**fcJ*** X Vm* X 


TCAATTTPAA 


TGAACAATGT 


GACGTATACG 


ALoLAAbALb 


1 1 ^ >1 A 

1134 0 




AACATGGTCA 


AGTTGTTACA 


GTAAPAAGAA 


ATGAATCTGT 


TGATTCAAAT 


L7ALAL7 1 vjLAcI 


1 14 U U 


35 


CAGTAACAGT 


GACACGACAA 


TTACAAGCAA 


CTACTGAAGG 


CGCTGTATTT 


att a a Arv^TT' 

A X X nnnuo X w 


X 1 *± b u 




GCGAffGGTTT 


TGATTTCGGA 


CACGTAGAAA 


GATTTATTCA 


AAACCCGCCA 


pATnnnrjrAA 

\-J\ X \jVjVAjL-AA 


1 1 con 


40 


CGGTTGCATG 


GCATGATAGT 


CCAGATACAT 


GGAAGAATAC 


AGTCGGTAAC 


aPTraTaaaa 


1 1 b 0 U 


CTGCGGTTGT 


AACATTACCT 


AATGGTCAAG 


GTACGCGTAA 


TGTTGAAGTT 


CCAGTCAAAG 


1 1 C A Pt 

1 X b *t U 




TTTATCCAGT 


TGCTAATGCA 


AAGGCGCCAT 


CACGTGATGT 


GAAAGGTCAA 


AATTTGACTA 


1 1 7nn 
XX / ou 


45 


ATGGAACGGA 


TGCGATGAAC 


TACATTACAT 


TTGATCCAAA 


TACAAACACA AATGGTATCA 


X X / b U 




CTGCAGCATG 


GGCAAATAGA 


CAACAACGAA 


ATAACCAACA 


AGCAGGCGTG 


CAACATTTAA 


1 1 0 O P 




ATGTCGATGT 


CACATATCCA 


GGTATTTCAG 


CTGCTAAACG 


AGTTCCTGTT 


ACTGTTAATG 


11880 


50 


TATATCAATT 


TGAATTCCCT 


CAAACTACTT 


ATACGACAAC 


GGTTGGAGGC 


ACTTTAG CAA 


11940 




GTGGTACGCA 


AGCATCAGGA 


TATGCACATA 


TGCAAAATGC 


TACTGGTTTA 


CCAACAGATG 


12000

TGAATAAACC GAATGTGGCT AAAGTCGTTA ACGCAAAATA TGACGTCATC TATAACGGAC 12120

ATACTTTTGC AACATCTTTA CCAGCGAAAT TTGTAGTAAA AGATGTGCAA CCAGCGAAAC 12180 

CAACTGTGAC TGAAACAGCG GCAGGAGCGA TTACAATTGC ACCTGGAGCA AACCAAACAG 12240 

TGAATACACA TGCCGGTAAC GTAACGACAT ACGCTGATAA ATTAGTTATT AAACGTAATG 12300 

GTAACGTTGT GACGACATTT ACACGTCGCA ATAATACGAG TCCATGGGTG AAAGAAGCAT 12360

CTGCAGCAAC TGTAGCAGGT ATTGCTGGAA CTAATAATGG TATTACTGTT GCAGCAGGTA 12420 

CTTTCAACCC TGCTGATACA ATTCAAGTTG TTGCAACGCA AGGAAGCGGA GAGACAGTGA 12480 

GTGATGAGCA ACGTAGTGAT GATTTCACAG TTGTCGCACC ACAACCGAAC CAAGCGACTA 12540 

CTAAGATTTG GCAAAATGGT CATATTGATA TGACGCCTAA TAATCCATCA GGACATTTAA 12600 

TTAATCCAAC TCAAGCAATG GATATTGCTT ACACTGAAAA AGTGGGTAAT GGTGCAGAAC 12660 

ATAGTAAGAC AATTAATGTT GTTCGTGGTC AAAATAATCA ATGGACAATT GCGAATAAGC 12720 

CTGACTATGT AACGTTAGAT GCACAAACTG GTAAAGTGAC GTTCAATGCC AATACTATAA 12780 

AACCAAATTC ATCAATCACA ATTACTCCGA AAGCAGGTAC AGGTCACTCA GTAAGTAGTA 12840 

ATCCAAGTAC ATTAACTGCA CCGGCAGCTC ATACTGTCAA CACAACTGAA ATTGTGAAAG 12900

ATTATGGTTC AAATGTAACA GCAGCTGAAA TTAACAATGC AGTTCaAGTT GCTAATAAAC 12960

GTACTGCAAC GATTAAAAAT GGCACAGCAA TGCCTACTAA TTTAGCTGGT GGTAGCACAA 13020

CGACGATTCC TGTGACAGTA ACTTACAATG ATGGTAGTAC TGAAGAAGTA CAAGAGTCCA 13080

TTTTCACAAA AGCGGATAAA CGTGAGTTAA TCACAGCTAA AAATCATTTA GATGATCCAG 13140

TAAGCACTGA AGGTAAAAAG CCAGGTACAA TTACGCAGTA CAATAATGCA ATGCATAATG 13200 

CGCAACAACA AATCAATACT GCGAAAACAG AAGCACAACA AGTGATTAAT AATGAGCGTG 13260

CAACACCACA ACAAGTTTCT GACGCACTAA CTAAAGTTCG TGCAGCACAA ACTAAGATTG 13320 

ATCAAGCTAA AGCATTACTT CAAAATAAAG AAGATAATAG CCAATTAGTA ACGTCTAAAA 13380

ATAACTTACA AAGTTCTGTG AACCAAGTAC CATCAACTGC TGGTATGACG CAACAAAGTA 13440 

TTGATAACTA TAATGCGAAG AAGCGTGAAG CAGAAACTGA AATAACTGCA GCTCAACGTG 13500

TTATTGACAA TGGCGATGCA ACTGCACAAC AAATTTCAGA TGAAAAACAT CGTGTCGATA 13560

ACGCATTAAC AGCATTAAAC CAAGCGAAAC ATGATTTAAC TGCAGATACA CATGCCTTAG 13620

AGCAAGCAGT GCAACAATTG AATCGCACAG GTACAACGAC TGGTAAGAAG CCGGCAAGTA 13680 

TTACTGCTTA CAATAATTCG ATTCGTGCAC TTCAAAGTGA CTTAACAAGT GCTAAAAATA 13740

GCGCTAATGC TATTATTCAA AAGCCAATAA GAACAGTACA AGAAGTGCAA TCTGCGTTAA 13800

CTGATAATAG TGCTTTAAAA ACTGCTAAGA CGAAACTTGA TGAAGAAATC AATAAATCAG 13920

TAACTACTGA TGGTATGACA CAATCATCAA TCCAAGCATA TGAAAATGCT AAACGTGCGG 13980

GTCAAACAGA ATCAACAAAT GCACAAAATG TTATTAACAA TGGTGATGCG ACTGACCAAC 14040

AAATTGCCGC AGAAAAAACA AAAGTAGAAG AAAAATATAA TAGCTTAAAA CAAGCAATTG 14100

CTGGATTAAC TCCAGACTTG GCACCATTAC AAACTGCAAA AACTCAGTTG CAAAATGATA 14160

TTGATCAGCC AACGAGTACG ACTGGTATGA CAAGCGCATC TATTGCAGCA TTTAATGAAA 14220

AACTTTCAGC AGCTAGAACT AAAATTCAAG AAATTGATCG TGTATTAGCC TCACATCCAG 14280

ATGTTGCGAC AATACGTCAA AACGTGACAG CAGCGAATGC CGCTAAATCA GCACTTGATC 14340

AAGCACGTAA TGGCTTAACA GTCGATAAAG CGCCTTTAGA AAATGCGAAA AATCAACTAC 14400

AACATAGTAT TGACACGCAA ACAAGTACAA CTGGTATGAC ACAAGACTCT ATAAATGCAT 14460

ACAATGCGAA GTTAACAGCT GCACGTAATA AGATTCAACA AATCAATCAA GTATTAGCAG 14520

GTTCACCGAC TGTAGAACAA ATTAATACAA ATACGTCTAC AGCAAATCAA GCTAAATCTG 14580

ATTTAGATCA TGCACGTCAA GCTTTAACAC CAGATAAAGC GCCGCTTCAA ACTGCGAAAA 14640

CGCAATTAGA ACAAAGCATT AATCAACCAA CGGATACAAC AGGTATGACG ACCGCTTCGT 14700

TAAATGCGTA CAACCAAAAA TTACAAGCAG CGCGTCAAAA GTTAACTGAA ATTAATCAAG 14760

TGTTGAATGG CAACCCAACT GTCCAAAATA TCAATGATAA AGTGACAGAG GCAAACCAAG 14820

CTAAGGATCA ATTAAATACA GCACGTCAAG GTTTAACATT AGATAGACAG CCAGCGTTAA 14880

CAACATTACA TGGTGCATCT AACTTAAACC AAGCACAACA AAATAATTTC ACGCAACAAA 14940

TTAATGCTGC TCAAAATcAT GctGCGCTTG AAACAATTAA GTCTAACATT ACGGCTTTAA 15000

ATACTGCGAT GACGAAATTA AAAGACAGTG TTGCGGATAA TAATACAATT AAATCAGATC 15060

AAAATTACAC TGACGCAACA CCAGCTAATA AACAAGCGTA TGATAATGCA GTTAATGCGG 15120

CTAAAGGTGT CATTGGAGAA ACGACTAATC CAACGATGGA TGTTAACACA GTGAACCAAA 15180

AAGCAGCATC TGTTAAATCG ACGAAAGATG CTTTAGATGG TCAACAAAAC TTACAACGTG 15240

CGAAAACAGA AGCAACAAAT GCGATTACGC ATGCAAGTGA TTTAAACCAA GCACAAAAGA 15300

ATGCATTAAC ACAACAAGTG AATAGTGcAC AAAACGTGCA AGCAGTAAAT GATATTAAAC 15360

AAACGACTCA AAGCTTAAAT ACTGCTATGA CAGGTTTAAA ACGTGGCGTT GCTAATCATA 15420

ACCAAGTCGT ACAAAGTGAT AATTATGTCA ACGCAGATAC TAATAAGAAA AATGATTACA 15480

ACAATGCATA CAACCATGCG AATGACATTA TTAATGGTAA TGCACAACAT CCAGTTATAA 15540

CACCAAGTGA TGTTAACAAT GCTTTATCAA ATGTCACAAG TAAAGAACAT GCATTGAATG 15600

ATTTAAATAA TGCACAACGT CAAAACTTAC AATCGCAAAT TAATGGTGCG CATCAAATTG 15720

ATGCAGTTAA TACAATTAAG CAAAATGCAA CAAACTTGAA TAGTGCAATG GGTAACTTAA 15780

GACAAGCTGT TGCAGATAAA GATCAAGTGA AACGTACAGA AGATTATGCG GATGCAGATA 15840 

CAGCTAAACA AAATGCATAT AACAGTGCAG TTTCAAGTGC CGAAACAATC ATTAATCAAA 15900

CAACAAATCC AACGATGTCT GTTGATGATG TTAATCGTGC AACTTCAGCT GTTACTTCTA 15960

ATAAAAATGC ATTAAATGGT TATGAAAAAT TAGCACAATC TAAAACAGAT GCTGCAAGAG 16020

CAATTGATGC ATTACCACAT TTAAATAATG CACAAAAAGC AGATGTTAAA TCTAAAATTA 16080

ATGCTGCATC AAATATTGCT GGCGTAAATA CTGTTAAACA ACAAGGTACA GATTTAAATA 16140

CAXCGATGGg TAACTTGCAA GGTGCAATCA ATGATGAACA AACGACGCTT AATAGTCAAA 16200 

ACTATCAAGA TGCGACACCT AGTAAGAAAA CAGCATACAC AAATGCGGTA CAAGCTGCGA 16260 

AAGATATTTT AAATAAATCA AATGGTCAAA ATAAAACGAA AGATCAAGTT ACTGAAGCGA 16320

TGAATCAAGT GAATTCTGCT AAAAATAACT TAGATGGTAC GCGTTTATTA GATCAAGCGA 16380

nCAAaCAGCA AAACAGCAGT TAAATAATAT GACGCATTTA ACAACTGCAC AAAAAACGAA 16440 

TTTAACAAAC CAAATTAATA GTGGTACTAC TGTCGCTGGT GTTCAAACGG TTCAATCAAA 16500

TGCCAATACA TTAGATCAAG CCATGAATAC GTTAAGACAA AGTATTGCCA ACAAAGATGC 16560 

GACTAAAGCA AGTGAAGATT ACGTAGATGC TAATAATGAT AAGCAAACAG CATATAACAA 16620 

CGCAGTAGCT GCTGCTGAAA CGATTATTAA TGCTAATAGT AATCCAGAAA TGAATCCAAG 16680

TACGATTACA CAAAAAGCAG AGCAAGTGAA TAGTTCTAAA ACGGCACTTA ACGGTGATGA 16740

AAACTTAGCT GCTGCAAAAC AAAATGCGAA AACGTACTTA AACACATTGA CAAGTATTAC 16800

AGATGCTCAA AAGAACAATT TGATTAGTCA AATTACTAGT GCGACAAGAG TGAGTGGTGT 16860

TGAXACTGTA AAACAAAATG CGCAACATCT AGACCAAGCT ATGGCTAGCT TACAGAATGG 16920

TATTAACAAC GAATCTCAAG TGAAATCATC TGAGAAATAT CGTGATGCTG ATACAAATAA 16980

ACAACAAGAG TATGATAATG CTATTACTGC AGCGAAAGCG ATTTTAAATA AATCGACAGG 17040

TCCAAACACT GCGCAAAATG CAGTTGAAGC AGCATTACAA CGTGTTAATA ATGCGAAAGA 17100

TGCATTGAAT GGTGATGCAA AATTAATTGC AGCTCAAAAC GCAGCGAAAC AACATTTAGG 17160 

TACTTTAACG CATATCACTA CAGCTCAACG TAATGATTTA ACAAATCAAA TTTCACAAGC 17220 

TACAAACTTA GCTGGTGTTG AATCTGTTAA ACAAAATGCG AATAGTTTAG ATGGTGCTAT 17280 

GGGTAACTTA CAAACGGCTA TCAACGATAA GTCAGGAACA TTAGCGAGCC AAAACTTCTT 17340

GGATGCTGAT GAGCAAAAAC GTAATGCATA CAATCAAGCT GTATCAGCAG CCGAAACCAT 17400

TGTTAATAAT GCGAAACATG CATTAAATGG TACGCAAAAC TTAAACAATG CGAAACAAGC 17520

AGCGATTACA GCAATCAATG GCGCATCTGA TTTAAATCAA AAACAAAAAG ATGCATTAAA 17580

AGCACAAGCT AATGGTGCTC AACGCGTATC TAATGCACAA GATGTACAGC ACAATGCGAC 17640

TGAACTGAAC ACGGCAATGG GCACATTAAA ACATGCCATC GCAGATAAGA CGAATACGTT 17700

AGCAAGCAGT AAATATGTTA ATGCCGATAG CACTAAACAA AATGCTTACA CAACTAAAGT 17760

TACCAATGCT GAACATATTA TTAGCGGTAC GCCAACGGTT GTTACGACAC CTTCAGAAGT 17820

AACAGCTGCA GCTAATCAAG TAAACAGCGC GAAACAAGAA TTAAATGGTG ACGAAAGATT 17880

ACGTGAAGCA AAACAAAACG CCAATACTGC TATTGATGCA TTAACACAAT TAAATACACC 17940

TCAAAAAGCT AAATTAAAAG AACAAGTGGG ACAAGCCAAT AGATTAGAAG ACGTACAAAC 18000

TGTTCAAACA AATGGACAAG CATTGAACAA TGCAATGAAA GGCTTAAGAG ATAGTATTGC 18060

TAACGAAACA ACAGTCAAAA CAAGTCAAAA CTATACAGAC GCAAGTCCGA ATAACCAATC 18120

AACATATAAT AGCGCTGTGT CAAATGCGAA AGGTATCATT AATCAAACTA ACAATCCGAC 18180

TATGGATACT AGTGCGATTA CCCAAGCTAC AACACAAGTG AATAATGCTA AAAATGGTTT 18240

AAACGGTGCT GAAAACTTAA GAAATGCACA AAACACTGCT AAGCAAAACT TAAATACATT 18300

ATCACACTTA ACAAATAACC AAAAATCTGC CATCTCATCA CAAATTGATC GTGCAGGTCA 18360

TGTGAGTGAG GTAACTGCTA CTAAAAATGC AGCAACTGAG TTGAATACGC AAATGGGTAA 18420

CTTGGAACAA GCTATCCATG ATCAAAACAC AGTTAAACAA AGTGTTAAAT TTACTGATGC 18480

AGATAAAGCT AAACGTGATG CGTATACAAA TGCGGTAAGC AGAGCTGAAG CAATTCTGAA 18540

TAAAACGCAA GGTGCAAATA CGTCTAAACA AGATGTTGAA GCGGCTATTC AAAATGTTTC 18600

AAGTGCTAAA AATGCATTGA ATGGTGATCA AAACGTTACA AATGCGAAGA ATGCAGCTAA 18660

AAATGCATTA AATAACTTAA CGTCAATTAA TAATGCACAA AAACGTGACT TAACAACTAA 18720

AATTGATCAA GCAACAACTG TAGCTGGTGT TGAAGCTGTA TCTAATACGA GTACACAATT 18780

GAAtACAGCG ATGGCTAACT TGCAAAATGG TATTAATGAT AAAACAAATA CACTAGCAAG 18840

TGAAAACTAT CATGATGCTG ATTCAGATAA GAAAACTGCT TATACTCAAG CCGTTACGAA 18900

CGCAGAAAAT ATTTTAAATA AAAATAGTGG ATCAAATTTA GACAAAACTG CCGTTGAAAA 18960

CGCGTTGTCA CAAGTTGCTA ATGCGAAAGG TGCCCTAAAT GGTAACCATA ATTTAGAGCA 19020

AGCTAAATCA AATGCAAACA CTACTATAAA CGGACTTCAA CATTTAACAA CTGCTCAAAA 19080

AGATAAATTG AAACAACAAG TGCAACAAGC ACAAAATGTT GCAGGTGTAG ATACTGTTAA 19140

ATCAAGTGCC AACACATTAA ATGGTGCTAT GGGTACGTTA AGAAATAGCA TACAAGATAA 19200

TAACAATGCT GTTGATAGTG CTAATGGTGT CATTAATGCA ACAAGCAATC CAAATATGGA 19320

TGCTAATGCA ATTAACCAAA TCGCTACACA AGTGACATCA ACGAAAAATG CATTAGATGG 19380 

TACACATAAT TTAACGCAAG CGAAACAAAC AGCAACAAAT GCCATCGATG GTGCTACTAA 19440 

CTTAAATAAA GCGCAAAAAG ATGCGTTAAA AGCACAAGTT ACAAGTGCGC AACGTGTTGC 19500 

AAATGTAACA AGTATCCAAC AAACTGCAAA TGAACTTAAT ACAGCTATGG GTCAATTACA 19560 

ACATGGTATT GATGATGAAA ATGCAACAAA ACAAACTCAA AAATATCGTG ACGcTGAACA 19620

AAGTAAGAAA ACTGCTTATG ATCAAGCTGT AGCTGCTGCG AAAGCAATTT TAAATAAACA 19680 

AACAGGTTCA AATTCAGATA AAGCAGCAGT TGACCGTGCA TTACAACAAG TAACAAGTAC 19740 

GAAAGATGCA TTGAATGGTG ATGCAAAACT GGCAGAAGCG AAAGCGGCAG CTAAACAAAA 19800 

CTTAGGCACT TTAAACCATA TTACGAATGC ACAACGTACT GACTTAGAAG GCCAAATCAA 19860 

TCAAGCGACG ACTGTTGATG GCGTTAATAC TGTAAAAACA AATGCCAATA CATTAGACGG 19920 

CGCAATGAAT AGCTTACAAG GTTCAATCAA TGATAAAGAT GCGACATTAA GAAATCAAAA 19980 

TTATCTTGAT GCGGATGAAT CAAAACGAAA TGCATATACG CAAGCTGTCA CAGCGGCTGA 20040 

AGGCATTTTA AATAAACAAA CTGGTGGTAA CACATCTAAA GCAGACGTTG ATAATGCATT 20100

AAATGCAGTT ACAAGAGCGA AAGcGgCTTT AAATGGTGCT GACAACTTAA GAAATGCGAA 20160

AACTTCAGCA ACAAATACGA TTGATGGTTT ACCTAACTTA ACACAATTAC AAAAAGACAA 20220 

CTTGAAGCAT CAAGTTGAaC AAGCGCAAAA TGTAGCAGGT GTAAATGGTG TTAAAGATAA 20280

AGGTAATACG TTAAATACTG CCATGGGTGC ATTACGTACA AGTATCCAAA ATGATAATAC 20340

GACGAAAACA AGTCAAAATT ATCTTGATGC ATCTGACAGC AACAAAAATA ATTACAATAC 20400 

TGCTGTAAAT AATGCAAATG GTGTTATTAA TGCAACGAAC AATCCAAATA TGGATGCTAA 20460 

TGCGATTAAT GGCATGGCAA ATCAAGTCAA TACAACAAAA GCAGCGTTAA ATGGTGCACA 20520

AAACTTAGCT CAAGCTAAAA CAAATGCGAC GAACACAATT AACAACGCAC ATGACTTAAA 20580

CCAAAAACAA AAAGATGCAT TAAAAACACA AGTTAACAAT GCACAACGTG TATcTGATGC 20640

AAATAACGTT CAACACACTG CAACTGAATT GAACAGTGCG ATGACAGCAC TTAAAGCAGC 20700

TATTGCTGAT AAAGAAAGAA CAAAAGCAAG CGGTAATTAT GTCAATGCTG ATCAAGAAAA 20760

ACGTCAAGCG TATGATTCAA AAGTGACTAA CGCTGAAAAT ATCATTAGTG GTACACCGAA 20820 

TGCGACATTA ACAGTCAATG ACGTAAATAG TGCGGCATCA CAAGTCAATG CGGCTAAAAC 20880

AGCATTAAAT GGTGATAACA ACTTACGTGT AGCGAAAGAG CATGCCAACA ATACAATTGA 20940

CGGCTTAGCA CAATTGAATA ATGCACAAAA AGCAAAATTA AAAGAACAAG TTCAAAGTGC 21000

GAAAGGCTTA AGAGATAGTA TTGCGAATGA AGCAACAATT AAAGCAGGTC AAAACTACAC 21120

TGACGCAAGT CCAAATAATC GTAACGAGTA CGACAGTGCA GTTACTGCAG CAAAAGCAAT 21180

CATTAATCAA ACATCGAACC CAACGATGGA ACCAAATACT ATTACGCAAG TAACATCACA 21240

AGTGACAACT AAAGAACAGG CATTAAATGG TGCGCGAAAC TTAGCTCAAG CTAAGACAAC 21300

TGCGAAAAAC AACTTGAATA ACTTAACATC AATTAACAAT GCACAAAAAG ATGCGTTAAC 21360

GCGTAgcATT GATGGTGCAA CAACAGTAGC TGGTGTAAAT CAAGAAACTG CAAAAGCAAC 21420

AGAATTAAAT AACGCAATGC ATAGTTTACA AAATGGTATC AATGATGAGA CACAAACAAA 21480

ACAAACTCAG AAATACCTAG ATGCAGAGCC AAGTAAGAAA TCAGCTTATG ATCAAGCAGT 21540

AAATGCAGCG AAAGCAATTT TAACAAAAGC TAGTGGTCAA AATGTAGACA AAGCAGCAGT 21600

TGAACAAGCA TTGCAAAATG TGAACAGTAC GAAGACGGCG TTGAACGGTG ATGCGAAATT 21660

AAATGAAGCT AAAGCAGCTG CGAAACAAAC GTTAGGTACA TTAACACACA TTAATAATGC 21720

ACAACGTACA GCGTTAGACA ATGAAATTAC ACAAGCAACA AATGTTGAAG GTGTTAATAC 21780

AGTTAAAGCC AAAGCGCAAC AATTAGATGG TGCTATGGGT CAATTAGAAA CATCAATTCG 21840

TGATAAAGAC ACGACGTTAC AAAGTCAAAA TTATCAAGAT GCTGATGATG CTAAACGAAC 21900

TGCTTATTCT CAAGCAGTAA ATGCAGCAGC AACTATTTTA AATAAAACAg CTGGCGGTAA 21960

TACACCTAAA GCAGATGTTG AAAGAGCAAT GCAAGCTGTT ACACAAGCAA ATACTGcATT 22020

AAACGGTATT CAmAACTTAG ATCGTGCGAA ACArGCTGCT AACACAGCGA TTACAAATGC 22080

TTCGGACTTA AATACAAAAC mAAAAGAAGC ATTAAAAgCA CAAGTAACAA GTGCAGGACG 22140

TGTATCTGCA GCAAATGGTG TTGAACATAC TGCGACTGAA TTAAATACTG CGATGACAGC 22200

TTTAAAGCGT GCCATTGCTG ATAAAGCTGA GACAAAAGCT AGTGGTAACT ATGTCAATGC 22260

TGATCJCGAAT AAACGTCAAG CATATGATGA AAAAGTTACA GCTGCCGAAA ATATCGTTAG 22320

TGGTACACCA ACACCAACGT TAACACCAGC AGATGTTACA AATGCAGCAA CGCAAGTAAC 22380

GAATGCTAAG ACGCAGTTAA ACGGTAATCA TAATTTAGAA GTAGCGAAAC AAAATGCTAA 22440

CACTGCAATT GATGGTTTAA CTTCTTTAAA TGGTCCGCAA AAAGCAAAAC TTAAAGAACA 22500

AGTGGGTCAA GCGACGACGT TGCCAAATGT TCAAACTGTT CGTGATAATG CACAAACATT 22560

AAACACTGCA ATGAAAGGTC TACGAGATAG CATTGCGAAT GAAGCAACGA TTAAAGCAGG 22620

TCAAAACTAC ACAGATGCAA GTCAAAACAA ACAAACTGAC TACAACAGTG CAGTCACTGC 22680

AGCAAAAGCA ATCATTGGTC AAACAACTAG TCCATCAATG AATGCGCAAG AAATTAATCA 22740

AGCGAAAGAC CAAGTGACAG CTAAACAACA AGCGTTAAAC GGTCAAGAAA ACTTAAGAAC 22800

AGATGCAGTG AAACGTCAAA TCGAAGGTGC AACGCATGTT AATGAAGTAA CACAAGCACA 22920

AAATAATGCG GATGCaTTAA ATACAGCTAT GACGAACTTG AAAAATGGTA TTCAAGATCA 22980

GAATACGATT AAGCAAGGTG TTAACTTCAC TGATGCCGAC GAAGCGAAAC GTAATGCATA 23040

TACAAATGCA GTGACGCAAG CTGAACAAAT TTTAAATAAA GCACAAGGTC CAAATACTTC 23100

AAAAGACGGT GTCGAAACTG CGTTAGAaAA TGTACAACGT GCTAAAAACG AATTGAACGG 23160

TAATCAAAAT GTTGCGAACG CTAAGACAAC TGCGAAAAAT GCATTGAATA ACCTAACATC 23220




AATTAATAAT 


GCACAAAAAG 


AAGCATTGAA 


ATCACAAATT 


GAAGGTGCGA 




23280 


15 


AGGTGTAAAT 


CAAGTGTCTA 


CAACGGCATC 


TGAATTAAAT 


ACAGCAATGA 


ULAAL. 1 1 AlA 


23340 


AAATGGTATT 


AATGATGAAG 


CAGCTACAAA 


AGCAGCGCTT 


AATGGTACTC 


AAAACCTTGA 


23400 




AAAAGCTAAA 


CAACACGCAA 


ATACAGCAAT 


TGACGGTTTA 


AGCCATTTAA 


LAAA IvjCALLA 


23460 


20 


AAAAGAGGCA 


TTAAAACAAT 


TGGTACAACA 


ATCGACTACT 


GTTGCAGAAG 


CAC-AALKa 1AA 


23520 




TGAGCAAAAA 


GCAAACAATG 


TTGATGCAGC 


AATGGACAAA 


TTACGTCAAA 


C*T 21 TTPP Zk H 
ulnl iijLA\jA 


23580 




TAATGCGACA 


ACAAAACAAA 


ACCAAAATTA 


TACTGATGCA AGTCAGAATA 


AnnAuuAlVjL 


23640 


25 


GTACAATAAT 


GCTGTCACAA 


CTGCACAAGG 


TATTATTGAT 


CAAACTACAA 


\j 1 LLAAL 111 


23700 




AGATCCGACT 


GTTATCAATC 


AAGCTGCTGG 


ACAAGTAAGC 


ACAACTAAAA 


rt.X'j^rlx X AAA 


23760 




TGGTAATGAA 


AACCTAGAGG 


CAGCGAAACA 


ACAAGCGTCA 


CAATCATTAG 


\j l l V— rt 1 X rvjM 


23820 


30 


TAACTTAAAT 


AATGCGCAAA 


AACAAACAGT 


TACTGATCAA 


ATTAATGGCG 




23880 




TGATGAAGCA 


AATCAAATTA 


AGCAAAATGC 


GCAAAACTTA 


AATACAGCGA 




23940 




GAAACAAGCG 


ATAGcTGACA 


AAGATGCTAC 


GAAAGCGACA 


GTTAACTTCA 




24000 


35 


TCAAGCAAAA 


CAACAAGCAT 


ATAACaCTGC 


TGTTACAAAT 


GCTGAAAATA 


TCATTTPAAA 


24060 




AGCTAATGGC 


GGCAATGCAA 


CACAAGCTGA 


AGTTGAACAA 


GCAATCAAAC 


AAGTTAATGC 


24120 


40 


TGCAAAACAA 


GCATTAAATG 


GTAATGCCAA 


CGTTCAACAT 


GCAAAAGACG 


AAGCAACAGC 


24180 


ATTAATTAAT 


AGCTCTAATG 


ACCTTAACCA 


AGCACAAAAA 


GACGCATTAA 


AACAACAAGT 


24240 




TCAAAATGCA 


ACTACTGTAG 


CTGGTGTAAA 


CAATGTTAAA 


CAAACAGCAC 


AAGAGTTAAA 


24300 


45 


CAATGCTATG 


ACACAATTAA 


AACAAGGCAT 


TGCAGATAAA 


GAACAAACAA 


AAGCTGATGG 


24360 




TAACTTTGTC 


AATGCAGATC 


CTGATAAGCA 


AAA TG CAT AT 


AATCAAGCAG 


TAGCGAAAGC 


24420 




TGAAGCATTA 


ATTAGTGctA 


CGCCTGATGT 


TGTCGTTACA 


CCTAGCGAAA 


TTACTGCAGC 


24480 


50 


GTTAAATAAA 


GTTACGCAAG 


CTAAAAATGA 


TTTAAATGGT 


AATACAAACT 


TAGCAACGGC 


24540 




GAAACAAAAT 


GTTCAACATG 


CTATTGATCA 


ATTGCCAAAC 


TTAAACCAAG 


CGCAACGTGA 


24600 
AGCGGCGACA ACGCTTAATG ACGCGATGAC ACAATTGAAA CAAGGTATTG CGAATAAAGC 24720

ACAAATTAAA GGTAGCGAGA ACTATCACGA TGCTGATACT GACAAGCAAA CAGCATATGA 24780

TAATGCAGTA ACAAAAGCAG AAGAATTGTT AAAACAAACA ACAAATCCAA CAATGGATCC 24840

AAATACAATT CAACAAGCAT TAACTAAAGT GAATGACACA AATCAAGCAC TTAACGGTAA 24900

TCAAAAATTA GCTGATGCCA AACAAGATGC TAAGACAACA CTTGGTACAC TAGATCATTT 24960

AAATGATGCT CAAAAACAAG CGCTAACAAC TCAAGTTGAA CAAGCACCAG ATATTGCAAC 25020

AGTTAATAAT GTTAAGCAAA ATGCTCAAAA TCTGAATAAT GCTATGACTA ACTTAAACAA 25080

TGCATTACAA GATAAAACTG AGACATTAAA TAGCATTAAC TTTACTGATG CAGATCAAGC 25140

TAAGAAAGAT GCTTATACTA ATGCGGTTTC ACATGCAGAA GGTATTTTAT CTAAAGCAAA 25200

TGGCAGCAAT GCAAGTCAAA CTGAAGTGGA ACAAGCGATG CAACGTGTGA ACGAAGCGAA 25260

ACAAGCATTG AATGGTAATG ACAATGTACA ACGTGCAAAA GATGCAGCGA AACAAGTGAT 25320

TACAAATGCA AATGATTTAA ATCAAGCAAT GACACAATTG AAACAAGGTA TTGCAGATAA 25380

AGACCAAACT AAAGCAAATG GTAACTTTGT CAATGCTGAT ACTGATAAGC AAAATGCTTA 25440

CAACAATGCG GTAGCACATG CTGAACAAAT AATTAGTGGT ACACCAAATG CAAACGTGGA 25500

TCCACAACAA GTGGCTCAAG CGTTACAACA AGTGAATCaA GCTAAGGGTG ATTTAAACGG 25560

TAACCATAAC TTACAAGTTG CTAAAGACAA TGCAAATACA GCCATTGATC AGTTACCAAA 25620

CTTAAATCAA CCACAAAAAA CAGCATTAAA AGACCAAGTG TCGCATGCAG AACTTGTTAC 25680

AGGTGTTAAT GCTATTAAGC AAAATGCTGA TGCGTTAAAT AATGcAATGG GTACATTGAA 25740

ACAACAAATT CAAGCGAACA GTCAAGTACC ACAGTCAGTT GACTTTACAC AAGCGGATCA 25800

AGACAAACAA CAAGCATATA ACAATGCGGC TAACCAAGCG CAACAAATCG CAAATGGCAT 25860

ACCAACACCT GTATTGACGC CTGATACAGT AACACAAGCA GTGACAACTA TGAATCAAGC 25920

GAAAGATGCA TTAAACGGTG ATGAAAAATT AGCACAAGCG AAACAAGAAG CTTTAGCAAA 25980

TCTTGATACG TTACGCGATT TAAATCAACC ACAACGTGAT GCATTACGTA ACCAAATCAA 26040

TCAAGCACAA GCGTTAGCTA CAGTTGAACA AACTAAACAA AATGCACAAA ATGTGAATAC 26100

aGCaATGAGT AACTTGAAAC aAGGTATTGC aAACAAAGAT ACTGTCAAAG CAAGTGAGAA 26160

CTATCATGAT GCTGATGCCG ATAAGCAAAC AGCATATACA AATGCAGTGT CTCAAGCGGA 26220

AGGTATTATC AATCAAACGA CAAATCCAAC GCTTAACCCA GATGAAATAA CACGTGCATT 26280

AACTCAAGTG ACTGATGCTA AAAATGGCTT AAACGGTGAA GCTAAATTGG CAACTGAAAA 26340

GCAAAATGCT AAAGATGCCG TAAGTGGGAT GACGCATTTA AACGATGCTC AAAAACAAGC 26400

AGCAACGAGC CTAGATCAAG CAATGGATCA ATTATCACAA GCTATTAATG ATAAAGCTCA 26520

AACATTAGCG GACGGTAATT ACTTAAATGC AGATCCTGAC AAACAAAATG CGTATAAACA 26580 

GGCAGTAGCA AAAGCTGAAG CATTATTGAA TAAACAAAGT GGTACTAATG AAGTACAAGC 2 6 640 

ACAAGTTGAA AGCATCACTA ATGAAGTGAA CGCAGCGAAA CAAGCATTAA ATGGTAATGA 26 700 

CAATTTGGCA AATGCAAAAC AACAAGCAAA ACAACAATTG GCGAACTTAA CACACTTAAA 26760 

TGATGCACAA AAACAATCAT TTGAAAGTCA AATTACACAA GCGCCACTTG TTACAGATGT 26820 

CACTACGATT AATCAAAAAG CACAAACGTT AGATCATGCG ATGGAATTAT TAAGAAATAG 26 88 0 

TGTTGCGGAT AATCAAACGA CATTAGCGTC TGAAGATTAT CATGATGCAA CTGCGCAAAG 26940 

ACAAAATGAC TATAACCAAG CTGTAACAGC TGCTAATAAT ATAATTAATC AAACTACATC 27000 

GCCTACGATG AATCCAGATG ATGTTAATGG TGCAACGACA CAAGTGAATA ATACGAAAGT 27 060 

TGCATTAGAT GGTGATGAAA ACCTTGCAGC AGCTAAACAA CAAGCAAACA ACAGACTTGA 2 712 0 

TCAATTAGAT CATTTGAATA ATGCGCAAAA GCAACAGTTA CAATCACAAA TTACGCAATC 27180 

ATCTGATATT GCTGCAGTTA ATGGTCACAA ACAAACAGCA GAATCTTTAA ATACTGCGAT 27240 

GGGTAACTTA ATTAATGCGA TTGCAGATCA TCAAGCCGTT GAACAACGTG GTAACTTCAT 27300 

CAATGCTGAT ACTGATAAAC AAACTGCTTA TAATACAGCG GTAAATGAAG CAGCAGCAAT 27360 

GATTAACAAA CAAACTGGTC AAAATGCGAA CCAAACAGAA GTAGAACAAG CTATTACTAA 2 7420 

AGTTCAAACA ACACTTCAAG CGTTAAATGG AGACCATAAT TTACAAGTTG CTAAAACAAA 27480 

TGCGACGCAA GCAATTGATG CTTTAACAAG CTTAAATGAT CCTCAAAAAA CAGCATTAAA 27540 

AGACCAAGTT ACAGCTGCAA CTTTAGTAAC TGCAGTTCAT CAAATTGAAC AAAATGCGAA 27600 

TACGCTTAAC CAAGCAATGC ATGGTTTAAG ACAGAGCATT CAAGATAACG CAGCAACTAA 27660 

AGCAAATAGC AAATATATCA ACGAAGATCA ACCAGAGCAA CAAAACTATG ATCAAGCTGT 27720 

TCAAGCCGCA AATAATATTA TCAATGAACA AACTGCAACA TTAGATAATA ATGCGATTAA 27780 

TCAAGCAGCG ACAACTGTGA ATACAACGAA AGCAGCATTA CATGGTGATG TGAAGTTACA 27 84 0 

AAATGATAAA GATCATGCTA AGCAAACGGT TAGTCAATTA GCACATCTAA ACAATGCACA 27 900 

AAAACATATG GAAGATACGT TAATTGATAG TGAAACAACT AGAACAGCAG TTAAGCAAGA 2 7 960 

TTTGACTGAA GCACAAGCAT TAGATCAACT TATGGATGCA TTACAACAAA GTATTGCTGA 2 8 020 

CAAAGATGCA ACACGTGCGA GCAGTGCATA TGTCAATGCA GAACCGAATA AAAAACAATC 2 8 080 

CTATGATGAA GCAGTTCAAA ATGCTGAGTC TATCATTGCA GGATTAAATA ATCCAACTAT 2 814 0 

CAATAAAGGT AATGTATCAA GTGCGACTCA AGCAGTAATA TCATCTAAAA ATGCATTAGA 28200

TCAATTAACA CCAGCTCAAC AACAAGCGCT AGAAAATCAA ATTAATAATG CAACAACTCG 28320

TGATAAAGTG GCTGAAATCA TTGCACAAGC GCAAgCATtA AATGAAGCGA TGAAAGCATT 2 8380 

AAAAGAAAGT ATTAAGGATC AACCACAAAC TGAAGCAAGT AGTAAATTTA TTAACGAGGA 28440 

TCAAGCGCAA AAAGATGCTT ATACGCAAGC AGTACAACAC GCGAAAGATT TGATTAACAA 2 8500 

AACAACTGAT CCTACATTAG CTAAATCAAT CATTGATCAA GCGACACAGG CAGTGACAGA 2 8560 

TGCTAAAAAC AATTTACATG GTGATCAAAA ACTAGCTCAA GATAAGCAAC GTGCAACAGA 2 8 620 

AACGTTAAAT AACTTGTCTA ACTTGAATAC ACCACAACGT CAAGCACTTG AAAATCAAAT 28680 

TAATAATGCA GCAACTCGTG GCGAAGTAGC ACAAAAATTA ACTGAAGCAC AAGCACTTAA 2 8 740 

CCAAGCAATG GAAGCTTTAC GTAATAGCAT TCAAGATCAA CAGCAAACGG AAGCGGGTAG 2 8800 

CAAGTTTATC AATGAAGATA AaCCaCmAAA AGrTGCTTAC CAAGCAGCAG TTCAAAATGC 2 8860 

AAAAGATTTA ATTAATCAAA CTAACAATCC AACGCTTGAT AAAGCACAAG TTGAACAATT 2 8 920 

GACACAAGCT GTTAACCAAG CTAAAGATAA CCTACACGGT GATCAAAAAC TTGCAGACGA 2 8980 

TAAACAACAT GCGGTTACTG ATTTAAATCA ATTAAATGGT TTGAATAATC CGCAACGTCA 2 9040 

AGCACTTGAA AGCCAAATAA ACAACGCAGC AACTCGTGGC GAAGTAGCAC AAAAATTAGC 29100

TGAAGCAAAA GCGCTTGATC AAGCAATGCA AGCATTACGT AATAGTATTC AAGATCAACA 2 916 0 

ACAAACAGAA TCTGGTAGCA AGTTTATCAA TGAAGATAAA CCGCAAAAAG ATGCTTACCA 2 9220 

AGCAGCAGTT CAAAATGCAA AAGATTTAAT TAACCAAACA GGTAATCCAA CACTCGACAA 2 9280 

ATCACAAGTA GAACAATTGA CACAAGCAGT AACAACTGCA AAAGATAATC TACATGGTGA 2 9340 

TCAAAAACTT GCTCGTGATC AACAACAAGC AGTAACAACT GTAAATGCAT TGCCAAACTT 2 9400 

AAATCATGCA CAACAACAAG CATTAACTGA TGCTATAAAT GCAGCGCCTA CAAGAACAGA 2 9460 

GGTTGCACAA CATGTTCAAA CTGCTACTGA ACTTGATCAC GCGATGGAAA CATTGAAAAA 2 9520 

TAAAGTTGAT CAAGTGAATA CAGATAAGGC TCAACCAAAT TACACTGAAG CGTCAACTGA 29580

TAAAAAAGAA GCAGTAGATC AAGCGTTACA AGCTGCAGAA AGCATTACAG ATCCAACTAA 29640 

TGGTTCAAAT GCGAATAAAG ACGCTGTAGA CCAAGTATTA ACTAAGCTTC AAGAAAAAGA 2 9700 

AAATGAGTTA AATGGTAATG AGAGAGTCGC TGAAGCTAAA ACACAAGCGA AACAAACTAT 2 9760 

TGACCAATTA ACACATTTAA ATGCTGATCA AATTGCAACT GCTAAACAAA ACATTGATCA 2 9820 

AGCGACGAAA CTTCAACCAA TTGCTGAATT AGTAGATCAA GCAACGCAAT TGAATCAATC 2 98 80 

TATGGATCAA TTACAACAAG CAGTTAATGA ACATGCTAAC GTTGAGCAAA CTGTAGATTA 29940

CACACAAGCA GATTCAGATA AACAAAATGC TTATAAACAA GCTATTGCTG ATGCTGAAAA 30000

TGCAAAACAA GCATTAAATG GTGATGAACG TGTAGCACTT GCTAAAACAA ATGGTAAACA 30120

TGACATCGAC CAATTGAATG CATTAAACAA TGCTCAACAA GATGGATTTA AAGGTCGCAT 3 0180 

CGATCAATCA AACGATTTAA ATCAAATCCA ACAAATTGTA GATGAGGCTA AGGCACTTAA 30240

TCGTGCAATG GATCAATTGT CACAAGAAAT CACTGACAAT GAAGGACGCA CGAAAGGTAG 3 03 00 

CACGAACTAT GTCAATGCAG ATACACAAGT CAAACAAGTA TATGATGAAA CGGTTGATAA 303 60 

AGCGAAACAA GCACTTGATA AATCGACTGG TCAAAACTTA ACTGCAAAAC AAGTTATCAA 30420

ATTAAATGAT GCAGTCACTG CAGCTAAGAA AGCATTAAAT GGTGAAGAAA GACTTAATAA 30480

TCGTAAAGCT GAAGCATTAC AAAGATTGGA TCAATTAACA CATCTAAACA ATGCTCAAAG 3 0540 

ACAATTAGCA ATCCAACAAA TTAATAATGC TGAAACGCTA AATAAAGCAT CTCGAGCAAT 30600 

TAATAGAGCA ACTAAATTAG ATAATGCAAT GGGTTCAGTA CAACAATATA TTGACGAACA 30660

GCACCTTGGT GTTATCAGCA GCACAAATTA CATCAATGCA GATGACAATT TGAAAGCAAA 30720 

TTATGATAAT GCAATTGCGA ATGCAGCACA TGAGTTAGAT AAAGTGCAAG GTAATGCAAT 30780 

TGCaAAAGCT GAAGCAGAGC AATTGAAACA AAATATTATC GATGCTCAAA ATGCATTAAA 3 0840 

TGGAGACCAA AACCTTGCAA ATGCCAAAGA TAAAGCAAAT GCGTTTGTTA ATTCGTTAAA 30900

TGGATTAAAT CAACAGCAAC AAGATCTTGC ACATAAAGCA ATTAACAATG CCGATACTGT 30 960 

ATCAGATGTA ACAGATATTG TTAATAATCA AATTGACTTA AATGATGCAA TGGAAACATT 31020 

GAAACATTTA GTTGACAATG AAATTCCAAA TGCAGAGCAA ACTGTCAATT ACCAAAACGC 31080

TGACGATAAT GCTAAA 31096 
(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2243 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

ATGACAGAAT GGGAGCGAGG ACTTAGAATG TTTCCTAAAT CAGGTTTATT AAATTTTGAG 60

TTAGCGATAG mAAATCGTTC ATTAAATGAT GATGAAAAAG CATTAAAATA TGTGCGTAAA 120

GCATTAAATG CAGACCCTAA AAATACAGAT TATATTAACT TAGAAAAAGA GTTGACTAAA 180

TCAAATGAGT CGAAAAATAA ATAACTTTTA TGATGTACAA CAGTTATTGA AAAGTTACGG 240

ATTTCTAATA TATTTTAAAA ATCCAGAAGA TATGTACGAA ATGATTCAAC AGGAGATTTC 300

TAATCAGAGA AGGAATGAAC AGAAATGACA AAAATTATTT TAGCAGCTGA TGTAGGCGGG 420

ACGACTTGTA AATTAGGTAT TTTCACACCT GAATTAGAAC AATTACATAA ATGGTCTATT 480

CACACTGATA CATCTGATAG TACAGGATAT ACACTTTTGA AAGGAATTTA TGATTCGTTT 540

GTTGAAAAAG TAAATGAAAA TAATTATAAT TTTTCAAATG TACTTGGCGT AGGTATTGGT 600

GTACCAGGTC CTGTTGACTT TGAAAAAGGT ACAGTAAATG GAGCAGTAAA CTTATATTGG 660

CCAGAAAAAG TTAATGTACG TGAGATTTTT GAACAATTCG TTGATTGTCC AGTGTATGTA 720

GATAATGATG CTAACATAGC TGCTTTAGGG GaGAAACACA AAGGTGCTGG TGAAGGTGCC 780

GATGATGTTG TTGCCATCAC ACTTGGTACA GGTCTAGGTG GAGGAATTAT TTCCAAATGG 840

TGAAATCGTA CATGGTCATA ATGGCTCtGG CGCAGAAATA GGTCATTTTA GAgCAGACTT 900

CgATCAACGA TTTaAATGTA ATTGTGGTCG TTCTGGATGT ATTGAAACAG TTGCTTCaGC 960

GACAGGCGTT GTTAACTTAG TTAACTTCtA CTATCCGAAG TTGACGTTTA GATCTTCTAT 1020

ATTAGAATTG ATTAAAGAAA ATAAGGTtAC aGCAAAAGCT GTTTTTGATG CGGCAAAAGC 1080

TGGTGACCAA TTCTGTATTT TCATTACTGA AAAGGTTGCA AACTATATTG GATATTTATG 1140

TAGTATTATT AGTGTTACAA GTAATCCGAA ATATATCGTT CTAGGTGGAG GAATGTCTAC 1200

TGCAGGACCT ATTTTAATTG AAAATATTAA AACAGAATAT CATAATTTAA CATTTGCACC 1260

TGCTCAATTT GAAACTGAAA TTGTACAAGC GAAATTAGGT AATGATGCAG GTATTACAGG 1320

AGCAGCAGGA TTAATCAAGA CCTATGTATT AGATAAAGAG GGGGTAAAAT AATGGCTATT 1380

GTTGATGTGG TTGTTATTCC AGTTGGAACG GAAGGTCCGA GTGTTAGTAA ATATATTGCA 1440

GATATTCAGA AAAAACTTCA AGAATATAAA GCAATGGGTA AAATTGATTT TCAATTAACA 1500

CCAATGAATA CTCTAATTGA AGGTGAATTA AGCGATGTAT TAGAAGTTGT GCAAGTGATA 1560

CATG^ATTAC CTTTTGATAA AGGTTTAAGT AGAGTTTGTA CAAATATCCG TATTGATGAC 1620

CGACGAGACA AATCTAGAAA AATGAATGAT AAACTAACAT CAGTACAAAA ACATTTAGAA 1680

AATAGTGGTG AAAACCTATG AGGATTTCAA GCTTAACTTT AGGCTTAGTT GATACTAATA 1740

CGTATTTCAT CGAAAATGAC AAAGCTGTTA TTCTGATTGA CCCTTCAGGT GAAAGTGAAA 1800

AAATTATTAA AAAATTAAAC CAAATAAATA AACCGTTAAA AGCTATTTTA TTAACACATG 1860

CACACTTTGA TCATATCGGA GCAGTCGATG ATATAGTTGA TCGATTCGAT GTCCCGGTTT 1920

ATATGCATGA AGCAGAGTTT GATTTTCTAA AAGATCCCGT TAAAAATGGG GCAGATAAAT 1980

TTAAGCAATA TGGATTACCA ATTATTACAA GTAAGGTAAC TCCTGAAAAG TTAAmCGAAG 2040

GTAGCACAGA AATAGAAGGA TTTAAGTTnT nAyrTGTaCA CACACCTGGA CATTCACCAG 2100

GAATCGGACG TACAGATTTA TATAAAGGTG ATTATGAAAC GCTAGTTGAT TCTATTCAAG 2220

ATAAAATATT TGAATTAGAA GGC 2243

(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8009 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

TTGGnATCAT tyAcgGTAAA AAGAATAAaG CAAGATTtAT TTCATTAGTA CTAATTTGTG 60

CAATGTTTGC AATTTGTTGG GTTGCATATA TTCAATGGGA GTCTACAATC GCTTCATTTA 120

CACAATCTAT TAATATTTCa ATGGCACAAT ATAGTGTTTT ATGGACAATT AACGGAATAA 180 

TGATTTTAGT AGCACAACCA TTAATTAAAC CGATTCTCTA TCTGTTAAAA GGAAACTTAA 24 0 

AGAAGCAAAT GTTTGTCGGC ATCATCATTT TTATGTTGTC GTTCTTTGTC ACGAGTTTTG 3 00 

CCGAAAACTT TACAATATTT GTTGTCGGTA TGATTATTTT AACTTTTGGA GAAATGTTTG 36 0 

TATGGCCAGC AGTTCCAACT ATAGCCAATC AGTTAGCGCC AGATGGTAAG CAAGGACAGT 420 

ACCAAGGTTT TGTGAATTCA GCTGCTACAG TAGGAAAAGC ATTTGGTCCA TTTCTTGGTG 4 80 

GTGTATTAGT TGATGCGTTT AATATGCGCA TGATGTTTAT CGGTATGATG CTACTACTTG 54 0 

TATTTGCATT AATATTATTA ATGGTTTTCA AGGAGAATAA TACGCAACCT AAAAAAATAG 6 00 

ATGCATAATG AGTAAATAGA ATTAACGTTA TAGACTTGAA ATAAATGTCG TTATAACATA 66 0 

ATATTAATTT GTATAATTTA ATTTCGTTTG GAGCTTTTCT ACAGAAAGCT AGTGATGCTG 720 

AGAGCTAGTG TTAAGGACTA AATGTAAATC GTATTAATTT TAAATTGAAT GAATGACATC 78 0 

TCTTACTATT AAAATGAGTG CACAATTTTT GTGAAATAGG GTGGTAACGC GGCAAATGTC 84 0 

GTCCCTATGT AAATAGAATA GTTAGAGGTG TCTTTTTTAT TGAATAGGAG GAAATGTGTT 900

GAATTACAAC CACAATCAAA TTGAAAAGAA ATGGcAAGAC TATTGGGACG AAAATAAAAC 96 0 

ATTTAAAACA AATGATAACT TAGGTCAAAA GAAATTTTAT GCTTTAGACA TGTTTCCATA 102 0 

TCCATCAGGT GCTGGTTTAC ATGTTGGACA TCCTGAGGGc TATACAGCAA CAGATATCAT 108 0 

TTCAAGATAT AAAAGAATGC AAGGATATAA TGTATTACAT CCGATGGGGT GGGATGCATT 114 0 

CGGATTACCA GCAGAGCAAT ATGCTTTAGA CACTGGCAAC GACCCACGTG AATTTACAAA 1200

GAAAAATATC CAAACTTTTA AACGACAAAT TAAAGAATTA GGGTTCAGTT ATGATTGGGA 1260

GTTATATAAC AAAGGTTTAG CATACGTTGA TGAAGTTGCA GTTAACTGGT GTCCAGCATT 1380

AGGCACTGTT TTATCTAACG AAGAAGTGAT TGATGGTGTC TCTGAACGTG GTGGACATCC 144 0 

AGTTTATCGT AAGCCGATGA AACAATGGGT ACTTAAAATC ACAGAATATG CAGATCAATT 150 0 

ATTAGCAGAT TTAGATGATT TAGATTGGCC TGAGTCTTTA AAAGATATGC AGCGCAATTG 156 0 

GATTGGACGT TCTGAAGGGG CCAAAGTTTC ATTTGATGTA GATAATACGG AAGGAAAAGT 162 0 

AGAAGTATTT ACGACTAGAC CAGATACAAT CTATGGTGCA TCATTCTTAG TCTTAAGTCC 16 8 0 

TGAACATGCA TTAGTTAATT CAATTACAAC AGATGAATAT AAAGAAAAAG TAAAAGCTTA 174 0 

TCAAACAGAA GCTTCTAAAA AGTCAGATTT AGAACGTACA GATTTAGCAA AAGATAAATC 18 00 

AGGTGTATTT ACTGGTGCAT ATGCAACTAA TCCTTTATCT GGTGAAAAAG TACAAATTTG 1860

GATTGCTGAT TATGTATTAT CAACATATGG TACTGGAGCA ATTATGGCAG TACCAGCGCA 192 0 

TGATGACAGA GATTATGAAT TTGCTAAAAA GTTTGATTTG CCAATCATTG AAGTCATCGA 19 80 

AGGTGGAAAT GTTGAAGAAG CAGCATACAC TGGTGAAGGT AAACATATTA ATTCTGGTGA 2040 

ACTTGATGGT TTAGAAAATG AAGCGGCAAT TACTAAAGCT ATTCAATTAT TAGAGCAAAA 2100

AGGTGCTGGC GAAAAGAAAG TTAATTACAA ATTAAGAGAT TGGTTATTCA GTCGTCAGCG 2160

TTATTGGGGC GAACCAATTC CTGTCATTCA TTGGGAAGAT GGAACAATGA CAACTGTTCC 2220 

TGAAGAAGAG CTACCATTGT TGTTACCTGA AACAGATGAA ATCAAGCCAT CAGGGACTGG 2 2 80 

TGAGTCTCCA CTAGCTAATA TTGATTCATT TGTAAATGTT GTAGATGAAA AAACAGGTAT 2340 

GAAAGGACGT CGTGAAACAA ATACAATGCC ACAATGGGCA GGTAGTTGTT GGTATTATTT 24 00 

ACGTTACATC GATCCTAAAA ATGAAAATAT GTTAGCAGAT CCTGAAAAAT TAAAACATTG 24 60 

GTTACCTGTT GATTTATATA TCGGTGGAGT AGAACATGCG GTTCTTCACT TATTATATGC 2520 

AAGATTTTGG CATAAAGTCC TTTATGATTT GGCTATCGTA CCTACTAAAG AACCTTTCCA 2 5 80 

AAAATTATTT AACCAAGGTA TGATTTTAGG AGAAGGTAAT GAGAAGATGA GTAAATCTAA 264 0 

AGGAAATGTA ATCAATCCTG ATGATATAGT ACAGTCTCAT GGTGCAGATA CTTTGCGTCT 2 7 00 

TTACGAAATG TTTATGGGAC CTTTAGATGC TGCAATTGCA TGGAGTGAAA AAGGATTAGA 2760

TGGGTCTCGT CGATTCTTAG ATCGCGTATG GCGTTTAATG GTAAATGAAG ATGGGACATT 2 820 

GAGTTCAAAA ATTGTAACTA CAAATAATAA ATCTTTAGAT AAAGTTTATA ACCAAACTGT 2880

TAAAAAGGTA ACAGAAGACT TTGAAACATT AGGATTTAAT ACTGCTATTA GTCAATTAAT 2 94 0 

GGTATTTATT AATGAGTGTT ATAAAGTTGA TGAAGTTTAT AAACCTTACA TTGAAGGCTT 3000

CGTTAAAATG TTAGCACCTA TTGCACCACA TATCGGTGAA GAATTATGGT CAAAATTAGG 3060

TGATGAAGTA GAAATCGTTG TTCAAGTGAA TGGTAAATTG AGAGCTAAAA TTAAAATTGC 3180

TAAAGATACA TCAAAAGAAG AAATGCAAGA AATTGCCTTA TCTAATGACA ATGTTAAAGC 3 24 0 

GAGTATTGAA GGTAAAGACA TCATGAAAGT CATCGCTGTT CCTCAAAAAT TAGTCAATAT 3 3 00 

TGTAGCTAAA TAATGTTTTA AGGAGGACTT TGAAATGAAG TCAATTACTA CAGATGAATT 3 36 0 

AAAAAATAAA CTTTTAGAAT CTAAACCAGT TCAAATTGTT GATGTTCGTA CTGATGAAGA 34 20 

AACAGCAATG GGATATATTC CTAATGCAAA GTTAATTCCA ATGGATACCA TTCCGGATAA 3480

TTTAAATTCA TTTAATAAAA ATGAAATATA TTATATTGTA TGTGCTGGTG GAGTTCGAAG 3 54 0 

CGCTAAAGTT GTAGAATATT TAGAGGCAAA TGGCATTGAT GCCGTAAATG TCGAAGGCGG 3600 

CATGCACGCA TGGGGCGATG AAGGTTTGGA AATAAAAAGT ATTTAAAGTA GTGACATAAT 3660 

TTAAAATAAT ATTACATTTG TAATGACACC AAGTAACGTT TCGGTTGCTT GGTGTTTTTT 3720

GGTATGAATT ACTTTCTGTT ACAAAACAAT CTAAAGCGTT CTTGTTATGT TTTATTAAGA 378 0 

TTTTAATTAC AAAACGGAAA CTAAATTGTA ATAAAATAAA ACTTTATTTT ATAAAATGAT 3 84 0 

GATGATAAAA TTGAGTGAAC TTAAAATATT GTACAAAATA ATATAGCTAT AAATATAATA 3 900 

TAGCTATAAA TATAATATGA GGGAGCGTAT ATTTTTAGCA TAATTCTTAA CAACACAGCA 3960

GAGAACAGAC AACCAGGAGG AAAATGAAAT GAATTTGTTA AAGAAAAATA AATATAGTAT 4 02 0 

TAGGAAGTAT AAAGTAGGCA TATTCTCTAC TTTAATCGGA ACAGTTTTAT TACTTTCAAA 4 080 

CCCAAATGGT GCACAAGCCT TAACTACGGA TAATAATGTA CAAAGCGATA CTAATCAAGC 4140

AACACCTGTA AATTCACAAG ATAAAGATGT TGCTAATAAT AGAGGTTTAG CAAATAGTGC 4 200 

GCAGAATACA CCTAATCAAT CTGCAACAAC CAATCAAGCA ACGAATCAAG CATTGGTTAA 4 2 60 

TCATAATAAT GGTAGTATAG TAAATCAAGC TACGCCAACA TCAGTGCAAT CAAGTACGCC 4 320 

TTCAGCACAA AACAATAATC ATACAGATGG CAATACAACA GCAACTGAGA CAGTGTCAAA 4 3 80 

CGCTAATAAT AATGATGTAG TGTCGAATAA TACCGCATTA AATGTACCAA CTAAAACAAA 4440 

TGAAAATGGT TCAGGAGGAC ATCTAACTTT AAAGGAAATT CAAGAAGATG TTCGTCATTC 4 500 

TTCAAATAAA CCAGAGCTAG TTGCAATTGC TGAACCAGCA TCTAATAGAC CGAAAAAGAG 4 56 0 

AAGTAGACGT GCGGCACCGG CAGATCCTAA TGCAACTCCA GCAGATCCAG CGGCTGCAGC 4620

GOTAGGAAAC GGTGGTGCAC CAGTTGCAAT TACAGCGCCA TATACGCCAA CAACTGATCC 4680

TAATGCCAAT AATGCAGGAC AAAATGCACC TAACGAAGTG CTGTCATTTG ATGACAATGG 4 74 0 

TATTAGACCA AGTACCAACC GTTCTGTGCC AACAGTAAAC GTTGTTAATA ACTTGCCGGG 4800

CTTCACACTA ATCAATGGTG GCAAAGTAGG GGTGTTTAGT CATGCAATGG TAAGAACGAG 4860

TCGTATACAT GGAACTGATA CGAATGACCA TGGCGATTTT AATGGTATCG AGAAAGCATT 4980

AACAGTAAAT CCGAATTCTG AATTAATCTT TGAATTTAAT ACAATGACTA CTAAAAACGG 5 04 0 

TCAAGGCGCA ACAAATGTTA TTATCAAAAA TGCTGATACT AATGATACGA TTGCTGAAAA 5100 

GACTGTTGAA GGCGGTCCAA CTTTGCGTTT ATTTAAAGTA CCTGATAATG TGAGAAATCT 5160 

CAAAATTCAA TTTGTACCTA AAAATGACGC AATAACAGAT GCGCGTGGCA TTTATCAACT 522 0 

AAAAGATGGT TACAAATACT ATAGCTTTGT TGACTCTATC GGACTTCATT CTGGGTCACA 52 80 

TGTTTTTGTT GAAAGACGAA CAATGGATCC AACAGCAACA AATAATAAAG AGTTTACTGT 5340

AACAACATCA TTAAAGAATA ATGGTAATTC TGGTGCTTCT CTAGATACAA ATGACTTTGT 54 00 

ATATCAAGTT CAATTACCTG AAGGTGTTGA ATATGTGAAC AATTCATTGA CTAAAGATTT 54 60 

TCCAAGTAAC AATTCAGGCG TTGATGTTAA TGATATGAAT GTTACATATG ATGCAGCAAA 552 0 

TCGTGTGATA ACAATTAAAA GTACTGGAGG AGGTACAGCA AACTCTCCGG CACGACTTAT 558 0 

GCCTGATAAA ATACTCGATT TAAGATATAA ATTACGTGTA AATAATGTGC CGACACCAAG 5640 

AACAGTAACA TTTAACGAGA CATTAACGTA TAAAACATAT ACACAAGATT TCATTAATTC 5700 

AGCTGCAGAA AGTCATACTG TAAGTACAAA TCCATATACT ATCGATATCA TCATGAATAA 5760

AGATGCATTA CAAGCCGAAG TTGACAGACG TATTCAACAA GCTGATTATA CATTTGCGTC 5820 

ATTAGATATC TTTAATGGTC TGAAACGACG CGCACAAACG ATTTTAGATG AAAATCGTAA 5 8 80 

CAATGTACCA TTAAATAAAA GAGTTTCTCA AGCATATATT GATTCATTAA CTAATCAAAT 5940

GCAACATACG TTAATTCGAA GTGTTGATGC TGAAAATGCA GTTAATAAAA AAGTTGACCA 60 00 

AATGGAAGAT TTAGTTAATC AAAATGATGA ATTGACAGAT GAAGAAAAAC AAGCAGCAAT 6 060 

ACAAGTTATC GAGGAACATA AAAATGAAAT AATTGGTAAT ATTGGTGACC AAACGACTGA 6120 

TGATGGCGTT ACTAGAATCA AAGATCAAGG TATACAGACC TTAAGTGGGG ATACTGCAAC 618 0 

ACCGGTTGTT AAACCAAATG CTAAAAAAGC AATACGTGAT AAAGCAACGA AACAAAGGGA 6 24 0 

AATTATCAAT GCAACACCAG ATGCTACTGA AGACGAGATT CAAGATGCAC TAAATCAATT 63 0 0 

AGCTACGGAT GAAACAGATG CTATTGATAA TGTTACGAAT GCTACTACAA ATGCTGACGT 63 6 0 

TGAAACAGCT AAAAATAATG GCATCAATAC TATTGGAGCA GTTGTTCCTC AAGTAACTCA 642 0 

TAAAAAAGCT GCAAGAGATG CAATTAACCA AGCAACAGCA ACGAAAAGAC AACAAATAAA 64 8 0 

TAGTAATAGA GAAGCAACTC AGGAAGAGAA AAATGCAGCA TTGAACGAAT TAACTCAAGC 654 0 

AACCAACCAT GCTTTAGAAC AAATCAATCA AGCAACAACA AATGCTAATG TTGATAACGC 6600

CAAAGGAGAT GGTCTAAATG CCATTAATCC AATTGCTCCT GTAACTGTTG TTAAGCAAGC 6660

TGATGCGACT CAAGAAGAAA GACAAGCAGC AATTGACAAA GTGAATGCTG CTGTAACTGC 6780

AGCAAACACA AACATTTTAA ACGCTAATAC CAATGCTGAT GTTGAACAAG TAAAGACAAA 6 84 0 

TGCGATTCAA GGAATACAAG CAATTACACC AGCTACAAAA GTAAAAACAG ATGCAAAAAA 6 900 

TGCCATCGAT AAAAGTGCGG AAACGCAACA TAATACGATA TTTAATAATA ATGATGCGAC 6 960 

GCTCGAAGAA CAACAAGCAC CACAACAATT ACTTGATCAA GCTGTAGCCA CAGCGAAGCA 7020

AAATATTAAT GCAGCAGATA CGAATCAAGA AGTTGCACAA GCAAAAGATC AGGGCACACA 7 0 80 

AAATATAGTA GTGATTCAAC CGGCAACACA AGTTAAAACG GATACTCGCA ATGTTGTAAA 714 0 

TGATAAAGCG CGAGAGGCGA TAACAAATAT CAATGCTACA ACTGGCGCGA CTCGAGAAGA 7200 

GAAACAAGAA GCGATAAATC GTGTCAATAC ACTTAAAAAT AGAGCATTAA CTGATATTGG 7260 

TGTGACGTCT ACTACTGCGA TGGTCAATAG TATTAGAGAC GATGCAGTCA ATCAAATCGG 7320 

CGCAGTTCAA CCGCATGTAA CGAAGAAACA AACTGCTACA GGTGTATTAA ATGATTTAGC 7 3 80 

AACTGCTAAA AAGCAAGAAA TTAATCAAAA CACAAATGCA ACAACTGAAG AAAAGCAAGT 7440 

GGCTTTAAAT CAAGTGGATC AAGAGTTAGC AACGGCAATT AATmATATAA ATCAAGCTGA 7500 

TACAAATGCG GAAGTAGATC AAGCGCAACA ATTAGGTACA AAAGCAATTA ATGCGATTCA 7560

GCCAAATATT GTTAAAAAAC CTGCAGCATT AGCACAAATC AATCAGCATT ATAATGCTAA 762 0 

ATTAGCTGAA ATCAATGCTA CACCAGATGC AACGAATGAT GAGAAAAATG CTGCGATCAA 7 6 80 

TACTTTAAAT CAAGACAGAC AACAAGCTAT TGAAAGTATT AAACAAGCTA ACACAAATGC 7740

AGAAGTAGAC CAAGCTGCGA CAGTAGCAGA GAATAATATC GATGCTGTTC AAGTTGATGT 7800

AGTAAAAAAA CAAGCAGCGC GAGATAAAAT CACTGCTGAA GTGGcGAacG TATTGaAGCG 7 860 

GTTAAACAAA CACCTAATGC AACTGACGAA GAAAAGCAGG CTGCTGTTAA TCAAATCCAA 7 920 

TCAACTTTAA AGATTCAAGC AATTTAATCC AAATTTAATC CAAAACCCAA ACAAATGGAT 7 980 

TCAGGGTAGG ACACCACTTA CAAATCCAA 8009
(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10953 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
ACCCACCCCn TGGGGATAnT TTACCTGGTG GGGCCTTCGA TTGCCTTTAG GTGAAACCaG 60

AGATGAATGC TAACCATATT CATTCTGCTA AAGATGGTCG TGTTACTGCG ACAGCTGAAA 180

TTATTCATCG AGGTAAGTCG ACACATGTAT GGGATATAAA AATTAAGAAT GACAAAGAAC 240

AATTAATTAC AGTTATGCGT GGTACAGTTG CTATTAAACC TTTAAAATAA AAGAACTGCT 300

AGCTGAAATG TTATGAGATA TTCATAACTA CGGCTAGCAG TTTTTTTATG CGCTATATTG 360

TTGTAGTTTT AGAAATGCTT GTTCAATGCG TTCGGCAGCT TTACGGCCAC CCATAACATT 420

TCTACCAAAT GGTCCTAATT CTAAGTCTGC AAAGCATCCT GCGACAAATA GATTTGGTAT 480

CCATTCTAAT TTTTCGGAAA TAACAGGGTA ATTACATTCG TTGATAGGTG CATCATAATT 540

TTGTATTAAT TGCTTAATAA GTGGTTGTGA CATAAAATCT TGTTCAAAAC CAGTTGCAAC 600

CATAATCTGT TGATATGGAA CAGAATCATT TTCAGTGTTA ATTACACCAC CACTAATTTG 660

AGTGATAGGT GTTTTATGCa CATTTATACG ACCATTTTTA ATATGTTTTT TAAGGCGTAA 720

GTACAGTTCG TGAGGCATTG ATCCTTTATG ACGTTCGCGT TGTACAATGG CATTTCTTTC 780

AGGCATGCTT TTAGTACTTA AAAATGAAGA CATATTTTTC GGACCTAACC AACCAGGATC 840

AGCATCAAAG TCATGTATTT CAATATCTTT ATTTAGCCAT AAATGAATCT TTTTATCGTT 900

ATCATGATTT AACAATTTAA GTGCAAGATG TGCAGCAGTa ATGCCGCTAC CAACGATATG 960

ATCGGTCTTA TCATATACTA CTTGATCAAG TTCTTTCTCG AAGATATGAT TTACATTCTG 1020

TTTGTCTTTT AAAATGTCAG GCATAAACGG AATATTTGTA CTGCCTATTG CAATAACGAC 1080

GCAATCTGTA GTGATAATTT GTCCATCTTC TAACTTGATA TGCCATTTGT CTTCTTGTTT 1140

ATCTAAAGTT TGAACTAAAC CTTGAACCAA GCAATCCTCT AATTGATATT GTTTAGAAGC 1200

ATGTGCAATA TGATCCATAA ACATTGTCAA TTCAGGTCGT TGATAAGGAC CATAAAAAGC 1260

ATTTGTATAT TGGTGCTGTT TAGCGAATTG TTTTAGATGG AACGGTTGTG GATGTACGTG 1320

ATGTACAATC GGTGATCTTA AATAAGGCAT TTCTATTCGA TTTGTATATG AGTTAAACCT 1380

TTGGCAAAAA GTTTCGTGTG GGTCAATGAT TGTTAATCGG TCTGTTGTTA ATCCGCTTGA 1440

TAATAGTTTT TGTGCGATTG CAGTTCCCTG TATGCCACCG CCGATAATTG TCCAATGCAT 1500

AATAAAACCT CTCTCTTTTT AAAACGTAAT AGTTACGATT TATAATTATT ATTATCATAA 1560

TACATAACGA CATGAAAGGC AATTAAATTA AAGAGATATA TGTAGATAGG GCGAATCTGT 1620

AGTCAAAGAA AAAATCATTG AAAAAGAGGT AACAATGTCA AAAGAwAACA GCAGTAAAAT 1680

CATTCCTAAT TTGGAATCAT CTTACTGCTG TTTGTTGTTG ATTTATATTC ATGATTTTGT 1740

TATATAATCT ACAATTTTGT GTCTTTTAAG TCTTCCGAAA TTTCATCGAC TTTAGTCTTT 1800

TTAGTATAAG GCGTTTTAAT ATTATATGCT GCTTTCATAA TCATATGACT TGAAAGAGGA 1860
GCAATAAAAT ATAAAAACGT ACCAAATAGT AATGACATTG CACCTAATGT TGATGCTTTT 1980

CCGGCAGCAT GTGCACGTGA ATATACATCT TCAAGTCTCA ATAATCCTAT AGCTGCTAGG 2040

GCGCTAATTA AAGCACCGAT GATAACAAAG ATAAGTGCAA GACTAATCAG TATGATTTTG 2100

ATCATGTTCA ATCACCTTAC CTTTGTCCAT AAATTTAGAG AATACTGCAG TACCTAAAAA 2160

AGCTAATATA CCAATCATCA TAATAACGAC AATCATGTAT TTAATATTTA ATAAAATACT 2220

GAATAATGCT ATAACTGCCA TTAATTGAAG ACCAATCGCA TCTAATGCGA CAACACGATC 2280

GGCAAGTGAT GGGCCTAGCA CAACGCGAAT GAGCATAGCT AACATAGAAA TGACAACTAT 2340

GATTAATGCA ATAACGATAA TAACATTATG ATTCATTATA TTTCGCCCAC CTCTCTTACA 2400

ATTTTCTCTA ATGATGTTTT AATACTTTCT ACTTCTTGCT CTTTAGTTGA AAAATCTATG 2460

GCATGAATAT AAATTTTTGT ACGATCGTCA CTTACACCAA GCACTACAGT ACCAGGTGTT 2520

AATGTAATTA AATTAGACAG CAAGACAATT TGCCAATCTT TTTTTAAATC TGTGTGATAA 2580

ACAAAGAATC CTGGTTCATT TTTAATCGAA GGTTTAATAA TAATTTTCAA AACATCAAAA 2640

TTAGCTTTAA TCAGTTCGAT TAAGAAAATA ATAACTAATT TAATAATACG ATATAGCGTG 2700

ATGACATAAA ATCTACCTGG TAACACTCTG TGTAAGAGGT AAACAAGAAC TAGGCCAAAG 2760

ATGAAACCTA ACACAAAGTT ATTTGTTGTG TAACTATTTG TCACAAACAA CCAAAACACT 2820

GCGATAATAA AGTTTAATAC TAATTGTACA GCCATGTTAT TTACCTCCTA ATACAGCTTT 2880

AACGTAGGTT GATGGATTGT AGAATGTTTC TGCACCAGCT TTTACCATTG GATATAAGTA 2940

ATCTGCTGAC AATCCATATA AAACAGTTAT CACAACTGCA ACGATTGCAA TCGTAGTTAA 3000

ATATTTGACG TCGACTTTGT TATTAAGATC ATATCCTTTT GGTTGACCGA AAAAGCCTTG 3060

TAGGAATATG CGAATGACAG AATATAATAC GACTAAACTT GATAATAAGA CGATGACACC 3120

ACTTAAATAA AATCCTCTTT CAAATGTTGA TTGGACAATA AAAAATTTTC CATAAAAGCC 3180

ACTGAGTGGG GGAATGCCAG CTAAACTTAA TGCTGCGATA AAGAATGACC AACCAAGTAC 3240

AGGATATCGT TTAATTAAGC CACCAAATTG TCTTAAATCA GCAGTGCCTG TAATTTTAAT 3300

CATAATTCCG ATAAGCAAGA ATAATGCAAG TTTTACTAAC ATGTCGTGCA ATGTATAGTA 3360

AATAGCCCCA ATCATACCTG ACTCTGTCAT CATTGCAACG CCGACTAAGA TCACACCTAC 3420

AGCAATCATG ACATTGTATA GGATGATTTT TTTAATGTTG GCATATGCAA CAGCACCGAC 3480

ACAACCAAAG ATGATCGTTA ATAGTGCTAA GAATAAAATG ACATAATGTG AAAAGCTTAC 3540

ATTATCACTA AAGAATAGGC TCAATGTTCT AGCGATTGCA TAAACACCAA CTTTTGTTAA 3600

CAAAGCACCA AAGAATGCAA TGATTGGAAT TGGTGGgCAT AGTATGCACT AGGTAACCAA 3660
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ATATTGACTA 


AGCCACTGTC 


ATGCGCTGAA 


AGGTTAGCTA 


ATTTATTGCT 


TATATCTGCT 


3780 




AGATTCAATG 


TTCCTACTAC 


TGAATATAAA 


ATCGCTACAC 


CCATTACGAA 


GAAGGATGAC 


3840 


5 


GATACAACGT 


TAACAAGAAC 


ATATTTTATT 


GTTTCTTGTA 


GTTGAATTTT 


TGTAGAACCA 


3900 




ATTACTAATA AGAAATAAGA TGACATTAAA AATACTTCGA AAAATACGAA TAGGTTGAAA 3960

ATGTCACCAG TTGTGAATGC ACCAATGATA CCTATTAACA TAAATAGTAC TGAAAAATAA 4020

TAATAATATC TTTCACGTTC AATACCAATT GTTTGGTATG AATATAAAAT CACAATAGCT 4080

GTAATAATAA TACTAGTAAT TATTAGTAGG GCACTGAATA TGTCTAATAC AAAGACAATA 4140

CTGTATGGTG CTTTCCATGA ACCTAGCTCT ACGCGTATTG GTCCATGTTT AACAACATTT 4200

GCTAAATTGA TAATTGCCGC GACCAAGGTT AATAATGTAC CGCCTAGTGC GACATAACGC 4260

TTTATAATAG GACGCTTTCC AATAAAGACA AGTAATATGG CTGTAATTAC TGGAATAACT 4320

AGCGTTAACA CAAGCATATT ACTTTCAATC ATCTTCTGGA ACTCCTTTCA TACTCTCAAC 4380

GTTATCTGTG CCTAATTCTT TATATGTTCT AAATGCTAAT ACTAAGAAAA AGGCTGTTGT 4440

CGCAAgGCGA TAACGATTGC TGTTAAAATA AGTGCTTGCG GGaTAGGaTC AACATAGCTT 4500

TTTACGTTCG CTTCATAAAT TGGAACAGTA CCATGTTTAA GTCCGCCCAT AGTTATTAAA 4560

AATAAATTTG CTGCATGTGT TAATAGTGTA GTTCCCATAA CAATTCGTAT CAGACTTTTA 4620

GACAAAACGA GATAGACACT AATTGCTGTG AGAATACCAC TAACAAAAAT CATAATAATT 4680

TCCACTATTC GTTCTCTCCA ATCGAAATAA TAATTGTCAT GACAGTACCA ACTACTGCAC 4740

ATAAAACACC GAAATCAAAG AATACTGCTG TTGTCATATG AACAGGTTCT AATATAAATA 4800

ACGGTATATC AAATGTGACA TGCGTAAAGA AATTTTTGCC TAAAAACCAA CTTGCGATAG 4860

GCGTCGCAAT ACAAAAAACT AATCCGATAC CTATCAAGAT TTTAAAATCT AATGGGAAAA 4920

TTTTACGCAT TGTTTCTATA TCAAATGCAA TCGTAATGAT AACAAGTGAA CTTGCGAATA 4980

ATAATCCGCC GACGAAACCG CCACCAGGTG TATAATGTCC TGCTAAGAAA AGTGAAAAAC 5040

CAAAGACCAT TACCATGAAA AAGATAATAA CTGCAGCAAA TTGCAAAATT AGATCATTTT 5100

GTTGTCTATT CATGATTTTT CACCTCGTTA CCTTGCGTTT GACGCTTTTT ACGTAATTTA 5160

ATCATTGTAT ATACAGCTAA TCCTGCGATA CCAAGCACAG ATGACTCGAA TAAAGTATCC 5220

ATACCACGGA AATCAACAAG TATGACGTTT ACCATGTTTT TACCGTGAGC tAAATCATAA 5280

ACGTGCTCTT GATAAAACTT AGATATCGAT TCAAAATGTC TATTTCCGTA TGCAATTAAA 5340

CCGATAATAA TGACGGACAA ACCAACACCA CCAGCAATTA AAGCATTAGT AAGCTGGAAT 5400

GAGCGCTTTT CATTATAACG ATTTAAATTT GGTAAGTGGT AGAAGCATAA TAAGAACAAT 5460
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ATAAACAATA CAGACACAGC ATATCCAACT GCACTTAACA TAATGATGCT AAATAATCTT 5 58 0 

GATTTAGCGA AAAGAATTAA AAAGGCAGCA CTTAATAATA AAATTACGAT ACAAACTTCG 564 0 

AAAATTCTAA TCGGACTAAC GTCTTTAAAA TTAATGTTGA AAGGTACTGA GAATATAGTG 570 0 

ACAAATGTTA ATAAAATTAA TGCACCAAAA ATGATAACTA AATTATTACG TGAATAATCG 576 0 

GTAACATAGC TATTCGTCAT CTTTTCAGAG TAGTTTGGAA TAACATTTGC ACTTCTGTTG 582 0 

TACCAATAAT TGAATGTTAG TTTACCAGGT TGTCGTTGCA ACAATTTCAC CCAATAACTA 5880

AATGTCACAA TTAGTAAGAT ACCTAAAATA TAAATCACTA ATGTTGATAA AAAGGCAGGC 594 0 

GTTAATCCAT GGAACATATG GAATTCAACA TCATCAATTA CCGTATGATT AATCGAAGag 6 000 

TnAGCTGGTT CAATAATCGA ATTAGTTAAA ATGCCAGGGA ATAAACCAAA TACAATTACT 6060 

AATGTAGCTA AAATAGCTGG TGATAAAAGC ATTAATATTG ATACTTCGTG TGCTTTTTTA 612 0 

GGTAATTGTT CAGGTTTATA TTGTCCGAAA AATATATGCA TTATAAATTT AATTGAATAT 6180 

ACAAATGTGA AGACACTGCC CACTATACCA ATGATTGGGA ATAGGTAGCC TAATGTATCA 624 0 

ACACTGAATA AATTTGCTTG GCTTGCTGTA AATGTTGTTT CTAAAAATGA TTCTTTTGAT 63 00 

AAGAAACCAT TGAACGGTGG TACACCAGCg CATACTTAAT GCTGTAATAA CAGTGATTGT 6360

AAATGAAATA GGCATAATTG TTAGTAAGCC ACCTAATTTC TTAACATCAC GTGTACCAGT 6420 

AGAATGATCC ACTGCACCTG TAATCATAAA TAGGGCACCT TTAAATGTTG CATGGTTGAT 6480 

TAAATGGAAT ATTGCAGCCG TAAATGCAGC AGCATATATT TTGCTATCAT CGCCTTGATA 6540

GTGATAACTA ATGGCACCGA TTCCAAGCAT CGCCATAATC ATACCTAATT GGGATACTGT 6600 

TGAAAATGCC AGTATACCTT TCAAGTCTTG TTGTTTTGTT GCGTTTAGCG AAgCCCAGAA 66 6 0 

TAATGTAATT AAACCAACGA GTGTGACAGT CCATACCCAA CCTTGCGATG CTGCGAAGAT 6720 

TGGTGTCATT CGAGCGATTA AATATAACCC TGCTTTAACC ATTGTTGCTG AATGAAGATA 6780 

AGCACTGACT GGTGTAGGTG CTTCCATTGC ATCTGGTAGC CAAATATAAA ATGGAAACTG 684 0 

AGCAGATTTT GTAAAAGCAC CAATCATGAT TAAAATCATC GCAAAAATGA AGAATGGGCT 6900 

ATTTTGAATT TCAGAAGCAT GTTGAATCAT GTACTGAATG CTAAATGATT GTGTTGGTAT 6 96 0 

AGCGAGTAAG ATGATACCAC CTAATAATGA TAGACCACCA AATACTGTGA TTATGAGCGA 7 02 0 

ri'lTl'GAGCA CCATATATAG ATGCTTGTCG TTCGCGCCAG AATGAAATAA GTAAAAAACT 70 80 

AGAAAATGAC GTTAGCTCCC AGAATAAATA TAGAATAATA ACATTATCTG AAAGTACGAC 714 0 

ACCTAACATT GCACCCATAA ATAGTAATAA ATAACAATAA AAATTCCCTA GTTGTTCTGA 7200

CTTACTTAAG TAGCCGATTG AATATAATAC TACTAAACTG CCGATTCCTG AAATAAGCAA 72 6 0 
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CCAATTTAAG GTTTTCATTA CAGTATTACC TGACATCGTC GTTTTAATTA ATGTAAGCAT 73 3 0 

ATAAATAAAT ATGACGATAG GGACAGGTAA TACGAACCAT CCTAAATGTA TACGTTTAAA 744 0 

AAATCTATAC AGGATAGGAA TAATGAGTGC GAATATTAAC GGTAATATCA CCGCAATATG 7500

TAACAAACTC ACTATGTTGT CCTCCTTTAA AAAATATTTA TGTTATTCAT TATACATGAA 756 0 

TGATATAGTT CTGAAAAACG TACACACTCC TTGTTGTGCT TTATTTTCAG AaGTATTTAA 7620 

ATAAGAAGAA ACACGTCATT TTTTATTTAA AATTTTCTTT GTATTGAAGT GAATAATCTT 7680

CTTTTAAGCG TGCTAAACTA GCTAAAGACA TTTCAGCATG TTTTGTTTGC TGAGCTTTAA 774 0 

GTTTAGTTTC TAAATCTGTA ATTGCTTGTT GAAGTGAATC TTCATAGCGC AATACATCAA 78 00 

CATTGAAGTC GCGTAATTGT GAACGTTTCG TATAGCGTTT TTCAAAATGG CTTAATGCTT 786 0 

TGCGGTCATG GAAAAATACA CCTTCAGTTT CAGTAGGGTT ATGTAAATCA CCTTGTTTCG 7 920 

GGTGTTTGAT AACTTGTTCA ACTTTAACAA GGACATCGTC TCCATTTTCT TCAACAATCG 7980

TGACACCATA GCTACCTGTT TTGTGTGAAA ATCGATATAG CTTCATGCTA TTTTCCTCCC 804 0 

TTAAAAGTAT GTTAATATAT ATGTATCATA ACATGAATGG AGAATATAAA TGGCTAACTA 810 0 

TCCACAGTTA AACAAAGAAG TACAACAAGG TGAAATCAAA GTGGTTATGC ACACAAATAA 8160

AGGTGACATG ACATTCAAAT TATTTCCAAA TATTGCACCA AAAACAGTTG AAAATTTTGT 822 0 

GACACATGCA AAAAATGGTT ATTATGATGG AATCACATTC CACCGTGTCA TTAATGACTT 82 8 0 

CATGATTCAA GGTGGCGATC CAACAGCTAC TGGTATGGGT GGCGAAAGTA TTTATGGCGG 8340

TGCTTTTGAA GATGAATTTT CATTAAATGC ATTTAACTTA TATGGCGCAT TATCAATGGC 8400 

TAACTCAGGA CCTAATACTA ATGGTTCACA ATTTTTCATT GTTCAAATGA AAGAAGTACC 846 0 

TCAAAATATG TTAAGTCAAC TTGCAGATGG TGGCTGGCCT CAACCAATCG TTGATGCATA 8 52 0 

TGGCSAAAAG GGTGGTACAC CATGGTTAGA TCAAAAACAT ACAGTATTCG GTCAAATCAT 858 0 

TGATGGTGAA aCTACATTAG AAGATATTGC AAATACAAAA GTGGGACCAC AAGATAAACC 8 64 0 

ACTTCATGAT GTTGTAATTG AATCTATTGA TGTTGAAGAA TAATATCTAA ACATAATTAA 8 70 0 

CTACCAACAT TTTAAACTCG GATAAAGCTA ATTTATGAAT GGATTAGTAT ATATTCCAAC 8 76 0 

gAAAATAAAT AAACTAATAT GATGAGCAAT CTCAATATAT TTATCaAGAA AGCACAGTTT 8820

TTAAATAGAT GTGTATTTTA AAGATAATAG TTGAGGTTGC TTTTTATGTT TTTACAGAGA 888 0 

ATTGCTATTC AAATAGTAAA TAAATTGAAA ACAAAGTAGC TGGATATCAT ATTGATTTAG 8 94 0 

ATAGGAATTT GTTGCTAATT TTATTTGTAA ATCCAAGTTT GTAGAATTCT TATTCATTTA 9000

TAAAATAATA TTCGTATGAT TTGATTTTTT AATTAGTCCA CCATTTCGAT TTGTGCTATG 906 0 
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AACATATCAA GGTGCGTGTA cTGGTATTCA ACCATACGGT GCGTTTGTTG AGACCCCTAA 9130 

TCATACTGAA GGACTGATTC ATATATCAGA AATTATGGAT GACTACGTTC ATAATTTGAA 92 4 0 

GAAATTTCTA TCAGAAGGCC AAATTGTTAA AGCTAAAATT TTGTCTATAG ATGATGAAGG 93 0 0 

AAAGCTTAAT CTATCATTAA AGGATAATGA TTACTTCAAA AATTATGAGC GTAAGAAGGA 936 0 

AAAACAATCA GTATTAGATG AAATCAGAGA AACAGAAAAA TATGGGTTTC AAACACTTAA 942 0 

AGAACGCTTA CCAATCTGGA TAAAACAGTC AAAGCGAGCA ATTCGAAACG ACTAAAGGAA 9480

CAGATAAATC GTACCGAAAA TCATACAAAG GGTCTGAAAT GAAAGTTTCT TAGACTATAA 954 0 

AAGAGATTAG TATCTATTAA ATTTTATTAG ATACTAATCT CTTTTTGTCT ACGATAACGT 9600 

AATATGaTTG ATTCTATTTA CACGTACAAA TGGTTTAAGG TGACATATCC ATTATCTTTG 9660 

TTAGATAGAA TCGTTGATTT GCaATATTGT ATGTGGATTT GTTTTTTTTA TTTATTTTAG 9720 

AAATGAGAAC TACAACTTAA AGTATTAAAC GAATTGCAAC TATATAAACA GATAATTGGA 9780 

GAATGAAAAA ATTACATGTT ATAGTCAACT CAATAATTTT AAGGAGGAAT TAAGTAATGA 984 0 

AAAGTAAATA CGAACCATTG TTTGATAAAG TAGAATTACC AAATGGAGTA GAGTTGAGAA 9900 

ATCGATTTGT GTTAGCCCCT TTAACACATA TTTCTTCAAA TGATGATGGT ACTATTTCAG 9960

ATGTAGAACT TCCTTATATT GAAAAGCGTT CACAAGATGT TGGTATTACA ATTAATGCTG 10020 

CGAGTAATGT GAGTGATGTC GGAAAAGCAT TTCCAGGACA GCCATCAATC GCGCATGACA 10080 

GTAATATTGA AGGACTAAAA CGATTAGCTA CAGCAATGAA GAAAAACGGT GCCAAAGCAC 10140

TCGTACAAAT ACATCATGGC GGTGCACAAG CATTGCCTGA ATTAACACCT GATGGAGACG 10200 

TCGTAGCACC AAGTCCAATT TCTTTAAAAA GTTTTGGTCA GAAACAAGAA CATAGTGCTA 10260 

GAGAAATGAC GAATGAAGAG ATTGAACAAG CAATCAAGGA TTTTGGTGAA GCAACGCGAC 10320 

GTGCAATTGA AGCAGGGTTT GATGGTGTTG AAATACATGG CGCGAATCAT TACTTAATTC 10380 

ATCAATTTGT ATCACCATAC TATAATAGAA GAAATGATGT ATGGGCAAAT CAATATAAAT 10440 

TCCCGGTCGC TGTGATTGAA GAAGTACTTA AAGCGAAAGA AGCGTATGGC AATAAAGACT 10 500 

TTATAGTTGG ATACAGATTA TCTCCAGAGG AAGCGGAGTC TCCAGGAATC ACAATGGAAA 10 560 

TTACAGAGGA ACTCGTTAAT AAAATTAGCC ATATGCCAAT CGACTATATT CATGTTTCAA 106 20 

TGATGGATAC GCATGCAACG ACACGTGAAG GTAAATACGC TGGACAAGAA AGACTGCCTT 10680 

TAATTCACAA ATGGATAAAT GGTCGTATGC CACTTATCGG TATTGGTTCA ATTTTCACAG 10740 

CTGACGAAGC TTTAGATGCA GTTGAAAATG TTGGTGTTGA CTTAGTAGCC ATTGGTAGAG 10800

AGCTACTACT GGATTATCAA TTTGTTGAAA AAATTAAAGA TGGACGGGAA GATGAAATTA 10 860 
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AATTTAATGA AGGGTTTTAT CCATTACCAC GTA 10 953 
(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8155 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

TTTGATAnAA AACTGAATnA ATTAAATGTA TCGATTCAAC CTAATGAAGT GAATTTACAA 6 0 

GTTAAAGTAG AGCCTTTTAG CAnAAAGGTT AAAGTAAATG TTAAACAGAA AGGTAGTTTA 12 0 

GCAGATGATA AAGAGTTAAG TTCGATTGAT TTAGAAGATA AAGAAATTGA AATCTTCGGT 180 

AGTCGAGATG ACTTACAAAA TATAAGCGAA GTTGATGCAG AAGTAGATTT AGATGGTATT 240

TCAGAATCAA CTGAAAAGAC TGTAAAAATC AATTTwCCAG AACATGTCAC TAAAGCACAA 3 00 

CCAAGTGAAA CGmAGGCTTA TATAAATGTA AAATAAATAG CTAAATTAAA GGAGAGTAAA 36 0 

CAATGGGAAA ATATTTTGGT ACAGACGGAg TAAGAGGTGT CGCAAACCAA GAACTAACAC 420

CTGAATTGGC ATTTAAATTA GGAAGATACG GTGGCTATGT TCTAGCaCAT AATAAAGGTG 4 80 

AAAAACACCC ACGTGTACTT GTAGGTCGCG ATACTAGAGT TTCAGGTGAA ATGTTAGAAT 54 0 

CAGCATTAAT AGCTGGTTTG ATTTCAATTG GTGCAGAAGT GATGCGATTA GGTATTATTT 60 0 

CAACACCAGG TGTTGCATAT TTAACACGCG ATATGGGTGC AGAGTTAGGT GTAATGATTT 660 

CAGCCTCTCA TAATCCAGTT GCAGATAATG GTATTAAATT CTTTGGATCA GATGGTTTTA 72 0 

AACTATCAGA TGAACAAGAA AATGAAATTG AAGCATTATT GGATCAAGAA AACCCAGAAT 780 

TACCAAGACC AGTTGGCAAT GATATTGTAC ATTATTCAGA TTACTTTGAA GGGGCACAAA 84 0 

AATATTTGAG CTATTTAAAA TCAACAGTAG ATGTTAACTT TGAAGGTTTG AAAATTGCTT 900

TAGATGGTGC AAATGGTTCA ACATCATCAC TAGCGCCATT CTTATTTGGT GACTTAGAAG 96 0 

CAGATACTGA AACAATTGGA TGTAGTCCTG ATGGATATAA TATCAATGAG AAATGTGGCT 102 0 

CTACACATCC TGAAAAATTA GCTGAAAAAG TAGTTGAAAC TGAAAGTGAT TTTGGGTTAG 1080

CATTTGACGG CGATGGAGAC AGAATCATAG CAGTAGATGA GAATGGTCAA ATCGTTGACG 114 0 

GTGACCAAAT TATGTTTATT ATTGGTCAAG AAATGCATAA AAATCAAGAA TTGAATAATG 12 0 0 

ACATGATTGT TTCTACTGTT ATGAGTAATT TAGGTTTTTA CAAAGCGCTT GAACAAGAAG 1260

GAATTAAATC TAATAAAACT AAAGTTGGCG ACAGATATGT AGTAGAAGAA ATGCGTCGCG 1320 
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w 






CTGGTGATGG TTTATTAACT GGTATTCAAT TAGCTTCTGT AATAAAAATG ACTGGTAAAT 144 0 

CACTAAGTGA ATTAGCTGGA CAAATGAAAA AATATCCACA ATCATTAATT AACGTACGCG 1500 

TAACAGATAA ATATCGTGTT GAAGAAAATG TTGACGTTAA AGAAGTTATG ACTAAAGTAG 1560 

AAGTAGAAAT GAATGGAGAA GGTCGAATTT TAGTAAGACC TTCTGGAACA aACCATTAGT 162 0 

TCCTGTCATG GTTGAAGCAG CAACTGATGA AGATGCTGAA aGATTTGCAC AACAAATAGC 1680

TGATGTGGTT CAAGATAAAA TGGGATTAGA TAAATAAATA CTGTATTACA AATGAGCCGA 174 0 

TGCGTATGcA nTcgtTTTTT GTGTTTGTAG AAATAATTTA TAGTACAAAC GTAAAATGAT 180 0 

ATAAACAAAA TAAAAACAAA GTAATCAATA TGTAATATAA AATACACTGG TACTCAATAT 186 0 

ATAATGATGA TAAAATTAAT TTTAATTAGA TAGAGTTGCT TTGTGTTTTT AACGCAGATG 1920 

CTACTACTTA TCTTAACAGT TGATTAAGTG AAATCATTTA ACAGCGAGAA TAATCAACCA 1980 

GGAGGATGAC TTAATGAATT TATTCAGACA ACAAAAATTT AGTATCAGAA AATTTAATGT 2040 

CGGTATTm TCAGCTTTAA TTGCCACTGT TACTTTTATA TCTACTAACC CGACAACAGC 210 0 

GTCTGCAGCA GAGCAAAATC AGCCTGCACA AAATCAACCA GCACAACCAG CTGATGCCAA 216 0 

TACACAGCCT AACGCAAATG CTGGTGCTCA AGCTAATCCT ACAGCACAGC CAGCTGCACC 2220

TGCCAACCAA GGACAACCAG CAGTACAACC AGCAAACCAA GGTGGACAGG CTAATCCAGC 22 80 

AGGAGGAGCA GCACAACCAA ATACACAACC AGCTGGACAA GGTGATCAAG CTGATCCGAA 2340 

TAACGCTGCA CAAGCACAAC CTGGAAATCA AGCAACACCG GCAAACCAAG CAGGTCAAGG 2400

AAATAACCAA GCAACACCTA ATAATAATGC AACACCGGCA AATCAAACAC AGCCAGCGAA 2460 

TGCTCCAGCA GCAGCGCAAC CAGCAGCACC TGTAGCAGCA AACGCACAAA CTCAAGATCC 2520 

AAATGCTAGC AATACTGGTG AAGGCAGTAT TAATACGACA TTAACATTTG ATGATCCTGC 2580 

CATATCAACA GATGAGAATA GACAGGATCC AACTGTAACT GTTACAGATA AAGTAAATGG 264 0 

TTATTCATTA ATTAACAACG GTAAGATTGG TTTCGTTAAC TCAGAATTAA GACGAAGCGA 2700 

TATGTTTGAT AAGAATAACC CTCAAAACTA TCAAGCTAAA GGAAACGTGG CTGCATTAGG 276 0 

TCGTGTGAAT GCAAATGATT CTACAGATCA TGGTAACTTT AACGGTATTT CAAAAACTGT 2 82 0 

AAATGTAAAA CCAGATTCAG AATTAATTAT TAACTTTACT ACTATGCAAA CGAATAGTAA 2880

GCAAGGTGCA ACAAATTTAG TTATTAAAGA TGCTAAGAAA AATACTGAAT TAGCAACTGT 294 0 

AAATGTTGCT AAGACTGGTA CTGCACATTT ATTTAAAGTA CCAACTGATG CTGATCGTTT 3 00 0 

AGATTTACAA TTTATTCCTG ACAATACAGC AGTTGCTGAT GCTTCAAGAA TTACAACAAA 3060

TAAAGATGGT TATAAATACT ATTCATTCAT TGATAATGTA GGTCTATTCT CAGGATCACA 312 0 
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TAATACTGAA 


ATCGGTAACA 


ATGGTAATTT 


TGGTGCTTCA 


TTAAAAGCAG 


ATCAATTTAA 


3240 




ATATGAAGTA ACATTACCAC AAGGTGTAAC TTACGTTAAT AATTCATTAA CTACAACATT 3300

CCCTAATGGT AATGAAGACA GTACAGTATT GAAAAATATG ACTGTTAATT ATGATCAAAA 3360

TGCAAATAAA GTTACATTTA CAAGCCAAGG TGTGACAACG GCACGTGGTA CACACACTAA 3420

AGAAGTTTTA TTCCCAGATA AATCTTTAAA ATTATCATAT AAAGTTAATG TTGCGAATAT 3480

CGATACACCT AAAAATATTG ATTTTAATGA AAAATTAACA TATCGTACTG CTTCAGATGT 3540

TGTAATTAAT AATGCGCAAC CAGAAGTaCA CTAACTGCAG ATCCATTTTC AGTAGCGGTT 3600

GAAATGAACA AAGATGCGTT GCAACAACAA GTAAACTCAC AAGTTGATAA TAGTCATTAC 3660

ACAACAGCAT CAATTGCAGA ATACAATAAA CTTAAACAAC AAGCAGATAC TATTTTAAAT 3720

GAAGATGCGA ATCATGTTAA AACTGCAAAT CGTGCATCTC AAGCGGATAT TGATGGTTTA 3780

GTAACTAAAT TACAAGCTGC ATTAATTGAT AATCAAGCAG CAATTGCTGA ATTAGATACT 3840

AAAGCTCAAG AAAAGGTTAC AGCAGCACAA CAAAGTAAAA AAGTTACGCA AGATGAAGTT 3900

GCAGCACTTG TAACTAAAAT TAACAATGAT AAAAATAATG CAATCGCAGA AATTAATAAA 3960

CAAACTACAG CACAAGGTGT CACAACTGAA AAAGATAATG GTATCGCAGT GTTAGAACAA 4020

GATGTGATTA CACCAACAGT TAAACCTCAA GCGAAACAAG ATATTATCCA AGCAGTTACA 4080

ACTCGTAAAC AACAAATTAA AAAGTCAAAT GCATCATTAC AAGATGAAAA AGATGTAGCA 4140

AATGATAAAA TTGGTAAAAT TGAAACAAAG GCAATTAAAG ATATTGATGC AGCAACAACA 4200

AATGCACAAG TAGAAGCCAT TAAAACAAAA GCAATCAATG ATATTAATCA AACTACACCT 4260

GCTACAACAG CTAAAGCAGC AGCTCTTGAA GAATTTGACG AAGTTGTTCA AGCACAAATT 4320

GATCAAGCAC CTTTAAATCC TGATACAACA AATGAAGAAG TAGCGGAAgC TATTGAACGT 4380

ATTAATGCAG CTAAAGTTTC TGGTGTTAAA GCAATTGAAG CGACAACGAC TGCACAAGAT 4440

TTAGAAAGAG TTAAAAACGA AGAAATCTCA AAAATTGAAA ATATTACTGA CTCTACGCAA 4500

ACAAAAATGG ATGCCTATAA TGAAGTTAAA CAAGCTGCAA CAGCTAGAAA AGCTCAAAAT 4560

GCTACAGTTT CAAATGCAAC AAATGAAGAA GTAGCAGAAG CTGATGCAGC AGTAGATGCA 4620

GCTCAAAAGC AAGGTTTACA TGACATCCAA GTTGTTAAAT CAAAACAGGA AGTTGCTGAT 4680

ACAAAATCAA AAGTATTAGA TAAAATCAAT GCAATTCAAA CACAAGCAAA AGTTAAACCT 4740

GCAGCTGATA CGGAAGTAGA AAACGCATAT AATACACGTA AACAAGAAAT TCAAAATAGC 4800

AATGCTTCAA CTACAGAAGA AAAACAAGCT GCATATACAG AATTAGATAC TAAAAAGCAA 4860

GAAGCAAGAA CAAATCTTGA TGCTGCAAAT ACAAACAGTG ATGTAACAAC AGCTAAAGAC 4920
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GCGGAAATCG CTCAAAAAGC AAGTGAACGT AAAACAGCAA TTGAAGCAAT GAATGATTCG 5040 

ACTACTGAAG AACAACAAGC AGCGAAAGAC AAAGTGGATC AAGCAGTAGT TACTGCAAAC 5100 

GCTGATATAG ATAATGCTGC AGCAAACAAT GATGTGGATA ATGCAAAAAC TACAAATGAA 516 0 

GCTACAATCG CAGCCATTAC ACCTGATGCA AATGTTAAAC CAGCAGCAAA ACAAGCAATT 522 0 

GGAGATAAAG TACAAGCTCA AGAAACAGCA ATTGATGGAA ATAACGGCTC AACAACTGAA 5280

GAAAAAGCAG CTGCTAAACA ACAAGTTCAA ACTGAAAAAA CAACAGCTGA TGCCGCAATA 534 0 

GATGCAGCAC ATACAAATGC GGAAGTTGAA GCGGCTAAAA AAGCAGCAAT TGCTAAAATT 5400 

GAAGCGATTC AGCCAGCAAC AACAACTAAA GATAATGCGA AAGAAGCAAT TGCTACGAAA 54 60 

GCGAATGAAC GTAAAACAGC AATCGCTCAA ACGCAAGACA TTACTGCTGA AGAAATTGCA 5520 

GCGGCTAATG CGGACGTAGA TAATGCTGTG ACACAAGCAA ATAGCAACAT TGAAGCTGCT 5580 

AATAGTCAAA ATGATGTAGA CCAAGCGAAA ACGACAGGTG AAAATAGTAT TGATCAAGTA 564 0 

ACACCAACAG TTAATAAAAA AGCAACTGCA CGTAATGAAA TCACAGCAAT TTTAAATAAC 5700 

AAATTGCAAG AGATTCAAGc tACGCCAGAT GCAACAGATG AAGAAAAACA AGCAGCTGAT 5760 

GCTGAAGCAA ATACTGAAAA TGGTAAAGCA AATCAAGCCA TTTCAGCAGC AACTACTAAC 5820 

GCACAAGTTG ATGAAGCTAA AGCAAATGCA GAAGCAGCGA TTAATGCGGT AACACCAAAA 58 80 

GTTGTGAAGA AACAAGCGGC TAAAGATGAA ATTGATCAAT TACAAGCAAC GCAAACAAAT 5 940 

GTTATCAATA ATGATCAGAA CGCTACAACA GAAGAAAAAG AAGCAGCTAT TCAACAATTA 6000

GCAACAGCAG TTACAGACGC GAAAAATAAT ATTACAGCTG CAACTGATGA TAATGGTGTA 6 0 60 

GATCAGGCGA AAGACGCTGG AAAGAATTCA ATTCAAAGCA CGCAACCAGC AACAGCGGTT 6120 

AAATCAAATG CTAAAAATGA TGTTGATCAA GCTGTGACAA CTCAAAATCA AGCAATTGAT 6180 

AATAGAACTG GTGCTACAAC TGAAGAGAAA AATGCAGCAA AAGATTTAGT TTTAAAAGCT 6 240 

AAAGAAAAAG CGTATCAAGA TATCTTAAAT GCACAAACAA CTAATGATGT TACGCAAATT 63 00 

AAAGATCAAG CAGTTGCTGA TATTCAAGGT ATTACTGCAG ATACAACAAT TAAAGATGTT 6 3 60 

GCGAAAGATG AATTAGCAAC AAAAGCAAAC GAACAAAAAG CGCTTATTGC ACAAACTGCA 6420

GATGCGACTA CTGAAGAAAA AGAACAAGCA AATCAACAAG TAGACGCACA ATTAACACAA 64 8 0 

GGTAATCAAA A7ATTGAAAA TGCACAGTCA ATCGATGATG TAAACACTGC AAAAGATAAT 6 54 0 

GCAATTCAAG CAATTGACCC AATTCAAGCA TCAACAGATG TTAAAACGAA TGCAAGAGCG 6 60 0 

GAATTGCTAA CTGAAATGCA AAATAAAATA ACTGAAATAC TTAATAATAA TGAGACTACT 666 0 

AATGAAGAAA AAGGTAACGA TATTGGACCA GTTAGAGCAG CATATGAAGA AGGTTTAAAT 6 72 0 
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AAAGTTCAAC 


AACTTCATGC 


AAATCCTGTT 


AAGAAACCAG 


CAGGTAAAAA 


AGAATTAGAT 


6840 




CAAGCTGCAG CTGATAAGAA AACACAAATA GAACAAACAC CAAATGCATC ACAACAAGAA 6900

ATTAATGATG CAAAACAAGA AGTTGATACT GAATTAAATC AAGCGAAAAC AAATGTCGAT 6960

CAATCATCAA CAAATGAATA TGTTGATAAT GCAGTTAAAG AAGGAAAAGC TAAAATTAAT 7020

GCAGTTAAAA CATTTAGTGA GTACAAAAAA GATGCTTTAG CTAAAATTGA AGATGCATAT 7080

AATGCTAAAG TAAACGAAGC GGATAACTCT AACGCATCGA CTTCAAGTGA AATTGCTGAA 7140

GCGAAACAAA AACTTGCTGA ATTAAAACAA ACTGCGGATC AAAATGTTAA TCAAGCTACT 7200

TCTAAAGATG ACATTGAAGT TCAAATTCAT AATGACTTAG ATAATATTAA CGATTACACA 7260

ATTCCAACAG GTAAAAAAGA ATCAGCTACA ACAGATTTAT ATGCTTATGC AGATCAGAAG 7320

AAAAATAATA TTTCAGCTGA CACTAATGCA ACACAAGATG AAAAGCAACA AGCAATTAAG 7380

CAAGTTGACC AAAATGTTCA AACTGCATTA GAAAGCATTA ATAATGGTGT GGATAATGGT 7440

GACGTTGATG ATGCATTAAC ACAAGGTAAA GCAGCAATTG ATGCTATTCA AGTAGATGCT 7500

ACTGTTAAAC CTAAAGCGAA CCAAGCTATT GAAGTTAAAG CAGAAGATAC GAAAGAATCT 7560

ATTGATCAAA GTGACCAGTT AACTGCTGAA GAAAAAACTG AAGCATTAGC AATGATTAAA 7620

CAAATTACAG ATCAAGCTAA ACAAGGTATT ACTGATGCAA CAACAACTGC TGAAGTTGAA 7680

AAAGCGAAAg cTCaAGGACT TGAAGCATTT GATAACATTC AAATCGACTC AACAGAAAAA 7740

CAAAAAGCTA TCGAAGAATT AGAAACTGCA CTAGACCAGA TTGAAGCAGG TGTAAATGTC 7800

AACGCTGATG CTACAACTGA AGAAAAAGAA GCGTTTACGA ATGCTTTAGA AGACATTTTA 7860

TCAAAAGCAA CTGaAGATAT TTCTGATCAA ACTACAAATG CAGAAATCGC TACTGTCAAA 7920

AATAGTGCGC TTGAACAACT TAAAGCACAA CGTATTAATC CTGAAGTTAA GAAAAATGCT 7980

TTGGAAGCAA TCAGAGAAGT GGTTAACAAG CAAATAGGAA tAATTAAAAA TGCAGATGCA 8040

GATGCATCGG CGGAAAGAnA TTGCACGTAC GGGATTTAGG TAGATATTTT GGACCGATTT 8100

GCTGGATAAA TTTAGGGTnA AACCCCAACC AATGCCGAAG TTGCCTGAAT TACCA 8155




(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1630 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
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CTGTTTTATT TGCAGCACCC ATACTGGAAA TCACTTTAAT CCCTCGGTCA AGACACTCTT 12 0 

TCATTAAGTG TACTTTGTAC ATTATTGTAT CACTTGCATC TACAAAATAA TCTATATCGT 18 0 

AGTTATCGAA AATTTCTTCA TATGTCTCTT CTGTATAAAA CATATGTAAG GGCGTGACTT 24 0 

TACAATCTGG ATTAATTAAT TTAATACGTT CTTCCATCAA AGAAACTTTA CTTTGTCCTA 3 00 

CCCTTGTAGT TAAAGCGTGT AATTGTCTGT TTACATTTGT AATATCAACA TCATCTTTAT 360

CTATTAATAT AATATGACCA ATATTCGTTC TTGCTAATGC TTCAGCAGCA AATGAACCAA 42 0 

CACCTCCAAC GCCAAGTATG ACAACAGTTT GTTGCTTCAA TAAATCTAAA CCTTGTTGTC 4 80 

CAATCGCTAG TTCATTTCTT GAAAATTGAT GTTTCATTAT TTTACCTCTT TCACTGATTT 54 0 

ATACATAAGT ACATAGTAAC TTAAAATTTT ATATTTAGCA TTATCACTTT GATTATTTTC 600 

CCAAAATTCA ACGAGGAAAC ATTTATTAAA CGCTATAAAA CCCAACTAAT TCTTTATTAA 660 

AAACTTAAAG AAACGCATAA AAATACGCAA GACAAAGTCT TGCGTATCGA TAGAGTCCGT 720 

ATTGCCGTAG TTATAATAGC TTGATCATTC GGCCTUTTAT ATACAGGTGG GTGCCCTGTT 78 0 

TCTTGTTTTG TACGTCCTTC ATATAAGGCG TGTACGCTGC AAGAAAACCC ATTGGGCTCC 84 0 

CTTGATCAAA GAGTGTTAGG CCCAAATTAA AAAGCAAACT TACGAACAAC TCAGATGACT 900 

ATCTTATGAT GTTATATTAC CACATAATTA AAATTAATGA AATTATAACA AACCAAAGTT 96 0 

TATTGATTTT TTAAAATTTA GTGACGAATT CGCAAAGAAA GTTCTTCTAA TTGTTTATCA 102 0 

GAAACTTCAC TAGGCGCATT CGTTAATAAA CATGTAGCAG ATGCTGTTTT AGGGAATGCG 1080 

ATTGTATCTC TCAAGTTTGT TCTATTAGTC AATAACATGA CTAATCGGTC tAATCCTAAT 114 0 

GCAATACCGC CATGTGGTGG TGCACCATAT TTAAATGCAT CTAGTaAGAA GCCGAACTGT 12 0 0 

TCCTgTGCTT GTTCTTTAGT AAATCCAAGA ACTTCGAACA TTTTTTCTTG TAACTCACCA 1260

TCATGAATTC TGATTGAACC GCCACCTAAT TCATAACCAT TTAATACTAT GTCATAAGCA 13 2 0 

TTTGCCTCAG CTTCtTCTGG CGCAGTGCCA AGCTTAGCAA TATCAGCTTC TTTTGGAGAT 13 80 

GTAAATGGAT GATGTGCTGC AACGTAACGT TTCGCATCTT CATCATATTC TAATAATGGC 1440

CAATCTGTCA CCCATAAGAA GTTTAATTTT GTTTCATCGA TTAAACCTAA TTCTTTAGCT 15 00 

AATTTGACAC GTAATGCACC TAAACTTTGT GCAACGACAT TTGGTttGTC TGCAACAAAC 156 0 

ATTACTAAGT CACCAGCTTC AGCACCAGTT AATGTAAGTA ATGTTTCAAC ATTTTCTGTT 162 0 

CAAAGAAACG 16 30 
(2) INFORMATION FOR SEQ ID NO: 65:
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 732 base pairs
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75 



(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

CAATTGGACA TCTTGTATGA AAAGGACAAC CTTGCGGCGG ATTACTTGGC GAAGGTAATT 6 0 

CTCCTTTTAA TATAATTCTA TTGTTATTAT GTTTATCAAT TTGTGGTATT GATGAAATCA 120

ACGCTTTTGT ATATGGATGT TTGGGATTTT CATAAATTTC TTTATCAGAT GCGATTTCAA 180

CTATATGACC TAAATACATA ACTCCAATGA CATCACTTAT ATGTTTTACT ACACTTAAAT 24 0 

CATGTGCGAT AAATAAATAG CTTAAGTTAA ATTGTTCTTG TAAATCTTTT AATAAATTCA 300 

GTACTTGAGA TTGAACAGAT ACATCTAATG CACTTACAGG CTCATCAGCA ACAATTAAAC 36 0 

TCGGACGCAA AGCCAATGCT CTTGCAATTC CCACTCTTTG TCTCTGTCCA CCTGAAAATT 42 0 

CATGTGCATA TTtATAATAT GCATCTTCAC TTAGGCCAAC ACATTTTAAT AAATATAGTA 4 80 

CTTCTTTTTT TATTTCTTCT TTTGGCAATT TTTTATAATT TAAAATAGGT TCTGAAATGA 54 0 

TATCTCCAAC CATTTGCATC GGATTCAATG ATGCATACGG ATCTTGAAAT ATCATCTGAT 600 

ATTGTTGTCG TGATTTTCTG AGTTTTTTAC CTTGTAATCT TGTTATATCT TCACCATTAA 66 0 

CAATTATTGA GCCTGAAGTT GCATCTTCAA GCCTGATAAT CACTTTACCT AACGTTGACT 72 0 

TACCACAACC CG 73 2 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5838 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

AATATATTCA TATGTTTCAT CAACAATATT AGCTGCTTTT TGAATTAAAG CAATTTCGTC 6 0 

AGCATCTTTG ACGTCTCTAA TTTTATCTAC AGTATTAGAA ATGCTTATTA ATGATATACG 120 

GCTTTTATTT AATTCAAGGT ATGTATCATA ACTTACATGA TGCCCCTCAA AACCTACATT 180 

TTCAAAATTT TCTTGGTGTA GCAATTCTTT AATCTCACCA ATAATAGTAG ATTTACGATT 24 0 

AATAATTTCA TAATTTGGCG CCTGCTTAGT TGCTTGATCA ATATATCTAA AGTCTGTTAT 30 0 

CAAATATTGT TTATCTTTAG ATATGATAAG TGCTCCACTG GTACCAGTAA AACCTGATAA 3 60 

ATATCTTCTA TTGTAATCCG AAAGAATGaT AATCGCATCT AAATGTTTTT GTTCTAAAAT 42 0 



CAACTTTATA CATTAAAATA ATATCATAAT AAGGATAAAA AATAATAGAT ATTGATTTTA 540

GGGAGATAGT AATGAAAAAA TTGGTTTCAA TTGTTGGCGC AACATTATTG TTAGCTGGAT 600

GTGGATCACA AAATTTAGCA CCATTAGAAG AnAAAACAAC AGATTTAAGA GAAGATAATC 660

ATCAACTCAA ACTAGATATT CAAGAACTTA ATCAACAAAT TAGTGATTCT AAATCTAAAA 720

TTAAAGGGCT TGAAAAGGAT AAAGAAAACA GTAAAAAAAC TGCATCTAAT AATACGAAAA 780

TTAAATTGAT GAATGTTACA TCAACATACT ACGACAAAGT TGCTAAAGCT TTGAAATCCT 840

ATAACGATAT TGAGAAAGAT GTAAGTAAAA ACAAAGGCGA TAAGAATGTT CAATCGAAAT 900

TAAATCAAAT TTCTAATGAT ATTCAAAGTG CTCACACTTC ATACAAAGAT GCTATCGATG 960

GTTTATCACT TAGTGATGAT GATAAAAAAA CGTCTAAAAA TATCGATAAA TTAAACTCTG 1020

ATTTGAATCA TGCATTTGAT GATATTAAAA ATGGCTATCA AAATAAAGAT AAAAAACAAC 1080

TTACAAAAGG ACAACAAGCG TTGTCAAAAT TAAACTTAAA TGCAAAATCA TGATAGGAGT 1140

CTTTTAATGC GTAATATAAT ATTTTATCTT GTACTTATTA TTGCTGCGAT TGGATTAGTA 1200

ATGAATCTAG ATGCCTTTAT TTTTTCAATC GTCAGAATGT TAATCAGCTT TGcgTAaTAG 1260

CTGGTATTAT TTATCTGATT TATTATTTCT TCATCTTAAC TGAAGACCAA CGCAAATATC 1320

GCAAAGCAAT GCgTrAaGTA TAAAAGAAAT CAAAGAAGAA AATAGATAAA AAAACGGAAG 1380

CACTTGTAGG TAAAATAGTC TACGTGCTTC CATTTTTTAT TCTAAAAACT ACTTTCTAAA 1440

CATCCATTCA TCTGAACGAT ATTTTTCAGT TAATTCTTCC ACTTCTGCCA ATTGAGCTTC 1500

TGtTAATTCA AGTGGCTTTA ATTCTATATT TAAACCTTTC TTAAAACCTT TCTCGAAAGC 1560

TTCTTCCATT TGACTAATAG TAATGTGTTC ATCTGAAATA TCATTGATGG CAACTGCTTT 1620

TTCAACGAAT GCCTCTTTCA TTTTTAATTT TAATCTTTCA TTTTTATAAA TrAACATATC 1680

AAACAGTTCA TCAATATCAA TATCTTGTAA AATCGAACCG TGTTGGAGGA TTACGCCCTT 1740

TTGTCTCGTT TGAGCACTCC CAGCAATCTT ACGGCCTTCA ACAACTAGCT CATACCAACT 1800

TGGTGCATCA AAACACACTG AACTTCGAGG TTGTTTTAAT TTTTGACGCT CTTCAGGCGT 1860

TTTAGGTACC GCAAAATAAG TATCAAATCC TAAGTTTTTA AATCCTTCTA ATAATCCTTG 1920

TGAAATCACT CTGTACGCTT CTGTAACTGT AGAAGGCATA TTCGGATGCG ATTCAGGCAC 1980

AATCACACTG TAAGTTAACT CTTTATCATG TAGCACCCCA CGGCCACCAG TTTGACGCCT 2040

TACGAGACCA AAACCTTTCT CTTTAACCTT ATCAATATCA ATTTCTTTTT GTAGCCTTTG 2100

GAAATACCCT ATTGATAATG TTGCAGGATT CCATGTGTAA AAACGTATAA CTGGATCAAT 2160

TTCACCTCTA GAGACAAAAT TTAATAACGC TTCATCCATT GOCATATTAT AATATGGOT^
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AAATGTATAA TATTTGATTC GCTAATTAAT CAATTTAACT AAATGAATAA TAATTGCAAT 23 4 0 

TCTTTAGTGA AATATTTTGA TAATTTGACC TAACAGTCTT ATAATTATAT TATCGTTTAA 24 0 0 

TTAGGGAGGA TGCAAGATGA GTGCTAGTTT GTACATCGCA ATAATTTTAG TTATAGCAAT 246 0 

TATTGCTTAT ATGATTGTTC AACAAATTCT TAACAAGCGA GCTGTTAAAG AATTAGATCA 2 5 20 

AAATGAATTC CATAATGGGA TTAGAAAAGC TCAAGTCATC GATGTTAGAG AGAAAGTTGA 25 8 0 

CTATGACTAC GGTCACATTA ATGGGTCTCG CAATATTCCT ATGACAATGT TCAGGCAACG 264 0 

ATTCCAAGGA TTAAGAAAAG ATCAACCGGT ATACTTATGT GATGCCAATG GGATTGCTAG 27 0 0 

CTATAGAGCC GCTCGTATTT TGAAAAAGAA TGGATATACA GATATCTATA TGTTAAAAGG 2 76 0 

CGGCTATAAA AAATGGACTG GAAAAATAAA GTCTAAAAAA TAGTTTTTGT AAATTTAATA 282 0 

TACGATTTAA TAAAATCTGA GTGTTAATTG ATCATCAATA ACAATACTCA GATTTTAATT 28 8 0 

TTTTAACAAA GTCTGTTACT ATATTTCTCT AGCTTCACTG ATCATTAAAC TTAGTTTCAG 2940

CATAATAAAG AAAGTTCAGC TCATTTTCAA TACGATTCAA TTACCGCAAT CTAAAAAATG 300 0 

AAAAGACAAT TTCTATGAAA GAATAATACC AAACCCTAAG AGTTATTACT TCGGTTTAGT 306 0 

TTTCTTGTTT AAATAGAAAT TGTCTTTTTC AATTGATTTT GAAACCATTA TCCTTAAATC 312 0 

TTCATACAAA GTTAGAATAA TAATTCTCGG AATATGTGTT TAATACTTTA TTTTTCCTGT 3180 

TTAAGATTTT CAAACTTTAA TATTGGTTTA CGAGCAGCTG TAGCTTCGTC TAATCGATCA 3 24 0 

ATCACAGTTG TATGTGGTGC TTCTAGCacT TTATCAGGAT CATTTTTAGC TTCTTCAGCA 33 00 

ATACTAATTA ATGTATCGAT AAAATAATCA AGTGTTTCTT TAGACTCTGT CTCAGTCGGT 33 6 0 

TCAATCATCA TACCTTCTTC AACATTTAAT GGGAAGTATA TTGTTGGTGG ATGTACACCG 3420

AAATCTAATA ATCGCTTAGC CATGTCTAAA GTACGTACAC CAAATTCTTT TTGACGCACA 34 80 

CCACTTAACA CAAACTCGTG TTTACAATAT TGTTTATAAG GTATTTCAAA GTGTTTAGAT 3 54 0 

AAACGTGCTT TAATATAATT CGCATTAAGA ACCGCTGCTT CAGAAACCTC TTTAAGTCCA 36 0 0 

GTTGCTCCCA TAGTTCGAAT ATACGTATAA GCTCTTAAGT AAATACCAAA GTTACCATAA 3 6 60 

AATGGTTTTA CACGTCCGAT AGAATTTTTA ATGTCATTAT CATATTTAAA TTTGTCGCCA 3 72 0 

TCTTTAATAA CCATTGGCTT TGGTAAGTAA CTTGCTAGTT CTTTTACTAC ACCGACTGGA 3780

CCTGAACCAG GACCGCCACC ACCATGTGGA CCAGTAAATG TTTTATGCAA GTTTAAATGA 3 84 0 

ACAGCATCAA ATCCCATATC TCCTGGGCGA ACTTTGTCCA TAATAGCGTT TAAATTCGCA 3 9 00 

CCATCATAAT ATAATAGACC ACCAGCATTA TGGACGATTT CACGGATTTC CATAATATTT 3 960 

TTTTCGAAAA TACCTAAAGT GTTTGGATTA GTTAACATAA TAGCTGCTGT ATTTTCATTT 4 02 0 
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GATTTAAATC CTGCAAATGa AGCTGAGGCT GGaTTCGTAC CATGCGCAGA ATCTGG cACA 414 0 

ATGACTTCAT CACGATGACC TTCACCATTA TTCTCATGGT AAGCTTTAAA TATCATCAAT 42 00 

GCAGTCCATT CACCATGTGC GCCAGCAGCT GGTTGTAATG TCACCTCATC CATACCAGTA 42 60 

ATTTCTTTTA ATTCTTCTTG CAAACTATAA ATAATTTCTA ATGAACCTTG AACTTGATCT 4 320 

TCATCTTGTA ATGGATGTGA TTCACTAAAT CCTGGTATTC TAGCAACCTT TTCATTAATT 43 80 

TTAGGGTTAT ACTTCATCGT ACATGAACCC AATGGATAAA ATCCGTTGTC TACACCGAAA 4440 

TTTTTATTTG AAAGTTCAGT ATAATGACGT ACTAAGTCTA GTTCAGCAAC TTCAGGAAAC 4 50 0 

TCCGCTTTGT TTTTACGAAT AAATTTATCA TCTAACAATG ACTCAACAGA ATTTGTTTTA 4560 

ATATCACTTT TTGGTAATGA ATATGCATAT CTGCCTTCAC GAGATCTTTC AAAAATTAAT 4620 

GGACTTGATT TACTAGTCAT TTAACTCACC AGCCTTTTCT ACAAATGTAT CGATTTCATC 46 8 0 

TTTTGTTCTT AATTCAGTTA CAGCTATTAA CATGTGATTT TTAAAGTCGT CTGAAACAAC 4 74 0 

ACCTAAATCA AAACCACCGA TAATATTGTA CTTCACTAAT TCCTCGTTAA CTTGTTGAAT 4 800 

TGGTTTGTCA AATTTGACTA CAAACTCATT GmnAAGnTGT ACCATCTAAT ACTTCAAAAC 4 86 0 

CTTTTTTAAT AAATTGTTGT TTAGCATAGT TAGCATGTTC TATATTTTGA ACTGCAATAT 4920

CATAGATACC TTGTTTACCA AGTGCTGACA TTGCAATTGA TGaCGcTAAA GCATTTAATG 4 980 

CTTGGTTAGA ACAAATATTA GATGTCGCTT TATCGCGTCG AATATGTTGT TCACGTGCTT 504 0 

GTAATGTTAA TACAAAGCCA CGATTACCTT CATCATCTTG TGTTTGACCG ACTAATCTAC 5100 

CTGGCACTTT ACGCATTAAC TTTTTCGTCG TTGCAAAATA TCCACAATGT GGCCCACCGA 516 0 

ATTGAGCAGG AATTCCGAAT GGCTGAGTAT CACCTACAAC AATATCTGCA CCAAATGAAC 522 0 

CTGGAGGTGT AAGTAATCCC AATGCTAATG GATTTGCATA TACGATAAAT AATGCTTTTT 52 80 

TATCFTCAAT AAAGCTATGA ATCTTTTCAA GATCTTCAAT TGAACCGTAA AAGTTTGGAT 534 0 

ATTGTACTGC AACAGCTGCT GTTTCATCAT CCACTGCTGC TTCTAATTTT TTCAAATCTG 5400 

TAACAGTGCC ATCTAAATCG ATTTCCACTA CTTCGAATTC CTTACGCGTC TTAGCATAAG 54 6 0 

TATGAAGTAC TTGTAATGCT TGATAATGTA AACCTTTTGA GACTACAATT TTATTTTTCT 552 0 

TTGTTTGACT AAATGCTAAG ATACATGCTT CAGCAAAGCT AGTCATCCCA TCATACATAG 5580

AAGAATTTGC TACATCCATA TCTGTTAATT CACAAATTAA AGTTTGGAAC TCAAAAATGG 5 64 0 

CTTGTAATTC ACCTTGAGAA ATTTCCGGTT GATATGGCGT ATATGCTGTG TAAAATTCTG 57 0 0 

ATCTTGAAAT CATAGCATCC ACAACTGATG GCGCGTAATG ATCATAAACA CCAGCACCCA 5 76 0 

rAAATGATGT ATGCGTTTCT TTAGTGATAT tCTTGCTkGC AATGGGGATT TAAACnTCTA 5820





EP0 786 519 A2 



(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16355 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:





ATnATAATTG GCTTTGCTAA TAATTACTTC CCTGAATTAC aAGTATTAGC AAACGAAATA 60

AAATCTGATA TGGCTAGTTC ATTAAAACAA TGATATTTTT ATTTAAATTT TTaAAGCTTT 120

GTACGAAATT GTACAAAGCT TTTTTGGTGC GTATTGTATG GGCAACAACT TGACGATGAA 180

AATCCGTTAC AGGATTGGTA ATAGGAAATG TTAGCGAAAG ACAAGGGTAT CCATTGTAGA 240

TTAACAAAAG GACGTTTCCA CAAGTGTGGG TTATTCTCAC TAAAGCAATA CGCAGAGACA 300

ACTTACGTAA AATTTTGAAC TGACTAGAAC GGAACTTCTA CTCAATTATT GATAAAAATT 360

TTCAAAAAGA CTTGAATGTG CTGAGAATAC GAAGTTTATG GAAGGATTAT CAAAATATAA 420

ATGTGCATTC ATTTACAACC TTTATTGACA ATGATTCTCA ACTAATATAG TATATAATCA 480

AATCGTAATA GTTACGATTT GTTTTCTGCA ACTTTTTTGA AGTTTTAGTT GAGGTGAAAA 540

CAATAAAAGC ATCTAAGTGA ATGTAGTTAA CGGACAACTG CATTCGCTTG TAGAGCCACA 600

AGAAGCAACT TTAAATAAGG TTTACGGTTG CATTTTGATA CAACAACCGA TTACTAAGTC 660

ATGCTTTCCA CTTTGCGGGT TAGCATGACT TACCTAATAG ATAGAGCTAT TAGGTTCAGC 720

TTCTAAAAAA TTACAGTTTT AGAGGAATAC AGTTGcTTGc tTCGCAACAA CTGCATAAGA 780

GCCATGGTTT TCGCTTTTGC GAATTAGCAT GACTTACCTA CTAGATAGAG CTATTAGGTT 840

CATCTTCTAA AAAATTACAG GTTTAGAGGA ATACAGTTGT TTGcTTCGCA ACAACTGCAT 900

AAGAGCCTCT AGTAATTAAA ATTACAGAGG CTCTAAAAAT ACATCTAAAG GAGTGTCGTA 960

TGAATCGGCA GGTTATAGAA TTTTCTAAGT ATAATCCTTC GGGGAATATG ACGATACTTG 1020

TTCATTCAAA ACATGATGCT AGTGAATATG CATCTATCGC CAATCAGTTG ATGGCCGCAA 1080

CACATGTATG CTGTGAACAG GTAGGCTTTA TAGrATCAAC ACAAAATGAT GATGGTAATG 1140

ATTTTCACTT AGTTATGAGC GGTAATGAAT TTTGCGGTAA TGCGACGATG TCATATATAC 1200

ATCATTTGCA GGAAAGTCAT TTGCTTAAAG ACCAACAGTT TAAGGTGAAG GTGTCTGGCT 1260

GTTCGGATTT AGTGCAATGC GCAATTCATG ATTGCCAATA CTATGAAGTT CAAATGCCAC 1320

AAGCCCATCG TGTTGTGCCA ACAACAATTA ATATGGGTAA TCATTCATGG AAAGCAATAG 1380
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TTCAACATTT GGTTGAAGCG TTTGTGCGTG AgcAACAATG GAGTCACAAA TATAAAACAG 1500 

TAGGTATGAT GCTTTTTGAT GAACAACGTC AATTTTTACA GCCATTAATC TATATACCAG 156 0 

AAATTCAAAG TTTAATTTGG GAAAATAGCT GTGGTTCTGG TACAgcATCA ATTGGGGTTT 162 0 

TTAATAATTA TCAACGTAAT GACGCATGCA AAGATTTTAC AGTACATCAG CCAGGGGGCA 168 0 

GTATTTTAGT GACATCAAAG CGATGTCATC AATTGGGATA TCAAACTTCA ATTAAAGGAC 174 0 

AGGTTACAAC TGTAGCTACA GGaAAAGCAT ATATAGAATA AGGAGCCTAC AATGAATAAC 180 0 

TTTAATAATG AAATCAAATT GATATTACAA CAATATTTAG AAAAGTTTGA AGCGCATTAC 186 0 

GAGCGTGTAT TACAAGACGA TCAATATATC GAAGCATTAG AAACATTGAT GGATGACTAT 1920 

AGTGAATTTA TTTTAAATCC TATTTATGAA CAACAATTTA ATGCTTGGCG TGACGTTGAA 1980 

GAAAAAGCAC AATTaATAAA ATCACTGCAA TATATTACAG CGCAGTGTGT TAAACAAGTG 2 04 0 

GAAGTCATTA GAGCGAGACG TCTATTAGAC GGACAGGCGT CTACCACAGG TTACTTTGAC 2100 

20 

AATATAGAAC ATTGTATTGA TGAAGAGTTT GGACAATGTA GTATAG CTAG CAATGACAAA 216 0 

TTATTGTTAG TTGGTTCAGG TGCATATCCA ATGACGTTAA TTCAAGTAGC AAAAGAAACA 2220 

25 GGTGCTTCAG TTATCGGTAT TGATATTGAT CCACAAGCCG TTGACCTAGG GCGCAGAATC 228 0 

GTTAACGTCT TAGCACCAAA TGAAGATATA ACAATTACGG ATCAAAAGGT ATCTGAACTT 234 0 

AAAGATATCA AAGATGTGAC GCATATCATA TTCAGCTCGA CAATTCCTTT AAAGTACAGC 24 00 

30 ATTTTAGAAG AATTATATGA TTTAACAAAT GAAAATGTCG TAGTTGCAAT GCGCTTTGGT 24 6 0 

GATGGCATCA AAGCAATATT TAATTATCCG TCACAAGAAA CAGCGGAAGA TAAGTGGCAA 2 52 0 

TGTGTGAATA AACATATGAG ACCACAGCAA ATTTTTGATA TAGCACTTTA TAAAAAAGCA 2580 

GCTATAAAGG TAGGTATTAC GGATGTCTAA ATTATTAATG ATAGGCACTG GTCCgGTCGC 264 0 

AATGCAATTA GCGAATATTT GCTATTTAAA ATCAGATTAT GAGATTGATA TGGTTGGACG 2 70 0 

TGCCTCAACA TCAGAAAAAT CAAAACGCTT ATATCAAGCG TATAAAAAAG AGAAACAATT 276 0 

TGAAGTCAAA ATACAAAACG AGGCGCATCA ACATCTGGAA GGTAAGTTTG AAATTAATCG 2 82 0 

TTTGTATAAA GATGTTAAAA ACGTTAAGGG TGAATACGAA ACGGTTGTCA TGGCATGCAC 2 88 0 

45 AGCAGATGCT TATTATGACA CACTACAGCA ATTGTCGTTA GAAACTTTGC AAAGTGTCAA 2 94 0 

ACATGTCATT TTAATATCAC CGACATTTGG TTCGCAAATG ATTGTCGAAC AATTTATGTC 3 0 00 

TAAATTTAAT AAAGATATCG AAGTGATTTC ATTCTCAACT TATCTTGGCG ATACACGTAT 3 06 0 

50 TGTTGATAAA GAAGCGCCTA ATCATGTGTT GACAACAGGT GTAAAAAAGA AATTGTACAT 312 0 

GGGATCGACA CATTCAAACT CAACAATGTG TCAACGAATC TCTGCTTTAG rTGAGCAAT~ 1 9 0 



TTATGTGCAC CCACCACTAT TTATGAATGA CTTTTCATTG AAAGCCATTT TCGAAGGAAC 3300

AGATGTACCG GTTTATGTGT ATAAGTTATT TCCTGAAGGA CCGATAACGA TGACACTAAT 3360

CCGTGAAATG CGTTTAATGT GGAAGGAAAT GATGGTTATT TTACAAGCAT TTAGAGTGCC 3420

GTCAGTCAAC CTGCTTCAAT TTATGGTGAA GGAAAATTAT CCAGTACGTC CTGAAACTTT 3480

GGATGAAGGT GATATTGAGC ATTTCGAAAT CTTGCCAGAT ATCTTACAAG AATATCTGCT 3540

TTATGTAAGA TATACCGCAA TCCTCATTGA TCCATTTTCA CAGCCAGACG AAAACGGACA 3600

TTACTTTGAT TTTTCAGCTG TACCATTTAA GCAAGTCTAT AAAAATGAAC AGGATGTTGT 3660

TCAAATTCCA AGAATGCCAA GTGAAGATTA TTACAGAACG GCGATGATTC AGCATATTGG 3720

GAAAATGCTA GGTATCAAAA CGCCAATGAT TGATCAGTTC CTAACTCGCT ATGAAGCAAG 3780

TTGCCAGGCG TACAAGGATA TGCATCAAGA TCAACACTTA TCTTCTCAAT TTAATACAAA 3840

TCTATTTGAA GGAGATAAAG CACTCGTCAC AAAATTTTTG GAAATCAATA GAACGCTTTC 3900

ATAATAAGGG TTTGAAGTTT TATAATAGAA AAAAATTATT GAATTATGTT TGACATTTAC 3960

ATAAAAATAA GCAAATAATT GAGAAAAATA ATCATTACGA TTTGATTAAG TAATGCAACT 4020

TATCAATTTA GAAAGAGGAA AAGCAAATGA GAAAACTAAC TAAAATGAGT GCAATGTTAC 4080

TTGCATCAGG GCTAATTTTA ACTGGTTGTG GCGGTAATAA AGGTTTAGAG GAGAAAAAAG 4140

AAAACAAGCA ATTAACGTAT ACGACGGTTA AAGATATCGG TGATATGAAT CCGCATGTTT 4200

ACGGTGGATC AATGTCTGCT GAAAGTATGA TATACGAGCC GCTTGTACGT AACACGAAAG 4260

ATGGTATTAA GCCTTTACTA GCTAAAAAGT GGGATGTGTC TGAAGATGGG AAGACATACA 4320

CGTTCCATTT GAGAGATGAC GTTAAATTCC ATGATGGTAC GCCATTTGca TGctGACGCA 4380

GTTAAGAAAA ATATTGACGC AgTTCAAGAA AACAAAAAAT TGCATTCTTG GTTAAAGATT 4440

TCGACATTAA TTGACAATGT TAAAGTTAAA GATAAGTACA CGGTTGAATT GAATTTGAAA 4500

GAAGCATATC AACCTGCATT GGCTGAATTA GCGATGCCTC GTCCATATGT ATTTGTGTCT 4560

CCAAAAGACT TTaAAAACGG TACAAcAAAA GATGGCGTTA AAAAGTTCGA TGGTACTGGT 4620

CCATTTAAAT TAGGTGAACA CAAAAAAGAT GAGTCTGCAG ACTTTAACAA AAATGATCAA 4680

TACTGGGGCG AAAAGTCTAA ACTTAACAAA GTACAAGCAA AAGTAATGCC TGCTGGTGAA 4740

ACAGCATTCC TATCAATGAA AAAAGGTGAA ACGAACTTTG CCTTCACAGA TGATAGAGGT 4800

ACAGATAGCT TAGACAAAGA CTCTTTAAAA CAATTGAAAG ATACAGGTGA CTATCAAGTT 4860

AAGCGTAGTC AACCTATGAA TACGAAAATG TTAGTTGTCA ATTCTGGTAA AAAAGATAAC 4920

GCTGTGAGTG ACAAAACAGT CAGACAAGCG ATTGGTCATA TGGTAAACAG AGATAAAATT 4980

ACAGACATTA ATTTCGATAT GCCAACACGT AAGTATGACC TTAAAAAAGC AGAATCATTA 5100

TTAGATGAAG CTGGTTGGAA GAAAGGTAAA GACAGCGATG TTCGTCAAAA AGATGGTAAA 5160

AACCTTGAAA TGGCAATGTA CTATGACAAA GGTTCTTCAA GTCAAAAAGA ACAAGCAGAA 5220

TACTTACAAG CAGAATTTAA GAAAATGGGT ATTAAGTTAA ACATCAATGG CGAAACATCA 5280

GATAAAATTG CTGAACGTCG TACTTCTGGT GATTATGACT TAATGTTCAA CCAAACTTGG 5340

GGATTATTGT ACGATCCACA AAGTACTATT GCAGCATTTA AAGAGAAAAA TGGTTATGAA 5400

AGTGCAACAT CAGGCATTGA GAACAAAGAT AAAATATACA ACAGCATTGA TGACGCATTT 5460

AAAATCCAAA ACGGTAAAGA GCGTTCAGAC GCTTATAAAA ACATTTTGAA ACAAATTGAT 5520

GATGAAGGTA TCTTTATCCC TATTTCACAC GGTAGTATGA CAGTTGTTGC ACCaAAAGAT 5580

TTAGAAAAAG TATCATTCAC ACAATCACAG TATGAATTAC CATTCAATGA AATGCAGTAT 5640

AAATAAAGGA GCAATTAGAT GTTCAAATTT ATCTTAAAAC GTATTGCGCT CATGTTTCCA 5700

TTGATGATTG TAGTAAGTTT TATGACATTT CTATTGACGT ATATTACAAA TGAAAATCCA 5760

GCTGTGACAA TTTTACATGC ACAAGGGACG CCAAATGTAA CACCAGAGTT GATTGCAGAA 5820

ACGAATGAGA AGTACGGTTT CAATGATCCA TTATTAATTC AATATAAAAA TTGGTTACTT 5880

GAAGCGATGC AATTTAATTT TGGTACAAGC TACATTACAG GTGACCCAGT TGCTGAACGT 5940

ATTGGTCCAG CATTTATGAA TACATTGAAA TTAACAATAA TTTCAAGTGT TATGGTGATG 6000

ATTACATCAA TTATTTTAGG TGTAGTTAGT GCATTAAAAA GAGGAAAGTT CACTGATCGT 6060

GCGATACGTT CAGTGGCTTT CTTTCTAACT GCATTACCAT CATATTGGAT AGCTTCAATA 6120

CTTATTATTT ACGTTTCAGT GAAGTTAAAC ATATTGCCGA CTTCTGGATT AACAGGTCCA 6180

GAAAGTTACA TATTGCCAGT GATCGTTATT ACGATTGCCT ATGCTGGTAT TTACTTTAGA 6240

AATGTTAGAC GCTCGATGGT GGAACAATTA AATGAAGATT ATGTACTTTA TTTAAGAGCA 6300

AGCGGTGTGA AATCTATCAC ATTAATGTTG CATGTGTTGC GTAATGCTTT ACAAGTTGCG 6360

GTATCAATCT TTTGTATGTC TATACCAATG ATAATGGGTG GACTAGTTGT TATCGAGTAT 6420

ATCTTTGCAT GGCCTGGACT AGGTCAATTA AGTTTAAAAG CAATACTTGA ACACGATTTT 6480

CCAGTCATTC AAGCATATGT ATTAATTGTA GCGGTATTAT TTATTGTATT TAATACATTA 6540

GCAGATATCA TTAATGCGCT ATTAAATCCA AGATTAAGGG aGGGCGCACG ATGATAATTT 6600

TAAAmCGATT ATTmCArGwT AAAGGTGCAG TAATTGCTTT AGGCATTATT GTATTATATG 6660

TCTTTTTAGG ATTAGCAGCA CCACTTGTGA CATTTTATGA TCCTAACCAT ATCGATACAG 6720

CAAACAAA7T TGCTGGCATG AGTTTTCAAC ATCTACTAGG TACTGACCAT TTAGGTAGAG 6780

TATTTGTTTC TGTACTTATT GGATCTATTT TAGGATTCTT ATCAGGATAT TTCCAAGGGT 6900

TTGTTGACGC CTTAATCATG CGTGCGTGTG ATGTTATGTT GGCATTCCCA AGTTATGTTG 6960

TAACGTTAGC ATTAATTGCA TTGTTTGGAA TGGGTGCCGA AAATATTATC ATGGCATTTA 7020

TTTTGACGCG TTGGGCATGG TTCTGTCGTG TTATACGTAC AAGTGTTATG CAGTACACTG 7080

CTTCTGACCA TGTAAGATTT GCTAAAACAA TCGGTATGAA TGATATGAAA ATTATTCACA 7140

AACATATTAT GCCATTAACA TTAGCAGATA TTGCTATCAT CTCTAGTAGC TCGATGTGTT 7200

CAATGATCTT GCAAATATCT GGCTTTTCAT TTTTAGGATT AGGTGTCAAA GCGCCTACTG 7260

CAGAGTGGGG CATGATGCTT AACGAaGCTA GAAAAGTGAT GTTTACACAT CCTGAAATGA 7320

TGTTTGCGCC AGGTATTGCC ATAGTGATTA TAGTGATGGC ATTTAACTTC TTATCCGATG 7380

CTTTACAAAT TGCTATTGAT CCCCGCATCT CTTCTAAAGA TAAACTTCGT TCTGTGAAAA 7440

AAGGAGTGGT GCAATCATGA CATTGTTAAC AGTTAAACAT TTGACGATTA CAGATACCTG 7500

GACAGATCAA CCACTCGTGA GTGATGTGAA TTTTACATTA ACTAAGGGTG AAaCTTTAGG 7560

CGTTATTGGA GAAAGTGGTA GTGGTAAATC AATCACTTGT AAATCGATTA TTGGTTTGAA 7620

TCCCGAACGA CTCGGGGTGA CAGGTGAAAT TATCTTTGAT GGTACAtCAA TGTTGTCATT 7680

ATCTGAATCG CAATTGAAAA AGTACCGTGG TAAAGACATT GCGATGGTCA TGCAACAAGG 7740

TAGTCGTGCC TTTGACCCAT CAACTACTGT CGGTAAACAA ATGTTTGAGA CTATGAAAGT 7800

ACATACGTCA ATGTCTACAC AAGAAATTGA AAAGACATTG ATTGAATATA TGGATTATTT 7860

AAGTTTGAAA GATCCTAAAC GTATATTAAA ATCATACCCT TACATGTTAT CAGGAGGAAT 7920

GTTACAGCGA TTGATGATTG CTTTAGCGTT AgcTTTgAAA CCAAAGTTAA TCATTGCTGA 7980

TGAGCCGACA ACGGCTTTAG ATACAATTAC ACAATATGAT GTACTGGAAG CATTTATAGA 8040

TATT^AAAAA CACTTTGACT GTGCGATGAT TTTCATTTCA CATGATTTAA CGGTTATTAA 8100

CAAGATTGCA GACCGTGTTG TTGTGATGAA AAATGGTCAG CTTATTGAAC AAGGGACACG 8160

TGAATCAGTC TTGCATCATC CAGAACATGT TTATACGArt ATTkTATTAT CAACGAAGAA 8220

GAAGATTAAT GATCATTTTA AACATGTGAT GAGGGGTGAT GTACATGATT AAAATTAAAG 8280

ATGTTGAAAA GTCATATCAA AGCGCACATG TTTTTAAGCG TCGTCGAACA CCTATCGTGA 8340

AAGGTGTGTC ATTTGAGTGT CCAATCGGTG CGACGATTGC GATTATCGGA GAAAGTGGTA 8400

GCGGTAAATC GACGTTGAGT CktATGATAT TAGGTATTGA GAAACCGGAT AAAGGTTGTG 8460

TAACCTTAAA TGATCAACCG ATGCATAAGA AGAAAGTGAG ACGTCATCAA ATTGGTGCTG 8520

TATTTCAAGA TTATACGTCA TCATTACATC CATTTCAGAC TGTTAGAGAA ATCTTATTTG 8580

TGTTGGAAGA AGTCGGTCTA TCTAAGGCAT ACATGGATAA ATATCCTAAT ATGTTATCAG 8700

GTGGAGAGGC GCAACGTGTT GCGATTGCGC GTGCAATATG TATTAACCCT AAATATATTT 8760

TGTTTGATGA AGCCATTAGT TCACTCGACA TGTCAATTCA AACACAAATA TTAGATTTAT 8820

TGATTCATTT ACGTGAAACG CGTCAGTTGA GTTATATTTT TATCACACAT GATATTCAAG 8880

CTGCCACGTA TTTATGTGAT CAATTAATTA TTTTTAAAAA CGGAAAAATA GAAGAACAAA 8940

TTCCGACAAG CGCATTGCAT AAAAGTGACA ATGCTTATAC AAGAGAATTA ATAGAAAAAC 9000

AACTATCATT CTAAGGAGTG AGATAATGAA AGGTGCAATG GCTTGGCCCT TTTTGAGATT 9060

ATATATATTA ACATTGATGT TCTTTAGTGC CAATGCAATC TTAAACGTGT TTATACCTTT 9120

ACGAGGGCAT GATTTAGGCG CAACGAATAC GGTTATCGGT ATCGTTATGG GGGCATACAT 9180

GTTAACAGCA ATGGTATTTC GACCATGGGC AGGACAAATT ATTGCTCGTG TCGGTCCCAT 9240

TAAAGTATTA AGAATTATTT TGATTATCAA TGCCATAGCT TTAATTATTT ATGGTTTTAC 9300

TGGCTTAGAA GGTTATTTCG TAGCACGTGT TATGCAAGGT GTGTGTACGG CATTCTTTTC 9360

TATGTCTTTA CAGCTAGGTA TTATTGATGC ATTACCAGAG GAACATCGTT CTGAAGGTGT 9420

ATCATTGTAC TCGCTATTTT CAACGATTCC AAACTTAATC GGACCATTAG TTGCCGTAGG 9480

TATTTGGAAT GCAAATAATA TTTCACTATT TGCAATTGTC ATTATCTTTA TCGCATTAAC 9540

AACAACATTC TTTGsTATCG CGTGACCTTT GCTGAACAGG AACCCGATAC GTCAGATAAG 9600

ATTGAAAAAA TGCCGTTTAA CGCTGTAACT GTTTTTGCGC AATTTTTCAA AAATAAAGAG 9660

TTGTTGAACA GTGGTATTAT CATGATTGTT GCATCGATTG TATTTGGTGC AGTTAGTACA 9720

TTTGTACCGT TATACACAGT GAGTTTAGGA TTCGCGAATG CGGGAATCTT TTTGACAATA 9780

CAGGCCATCG CAGTTGTTGC GGCAAGATTT TACTTAAGGA AATACATTCC GTCAGATGGT 9840

ATGTCGCATC CTAAATATAT GGTATCTGTA CTATCATTAT TAGTAATCGC GTCATTTGTA 9900

GTGGCATTTG GTCCGCAAGT AGGTGCAATT ATTTTCTATG GTAGTGCGAT ATTAATAGGA 9960

ATGACGCAAG CAATGGTGTA CCCAACATTA ACATCATACT TAAGCTTCGT CTTACCAAAA 10020

GTAGGTCGTA ATATGTTGTT AGGTTTATTT ATTGCCTGTG CAGACTTAGG TATATCGTTA 10080

GGTGGCGCAT TGATGGGACC TATTTCCGAT TTAGTAGGAT TTAAATGGAT GTATCTAATT 10140

TGTGGTATGT TAGTCATTGT AATAATGATT ATGAGTTTCT TGAAAAAGCC AACACCACGT 10200

CCAGCGAGTA GTCTTTAATG AAGTGAATTA AAGCATATTA AGTTAATGAA TATTTAAATT 10260

TTAAAAGGTA TATTGaGCAT GGCGATTCAT GTGCTTCATG CTAGGACATG AAACATTCTA 10320

TATGGCTCGT TTTTAGAACG ACAtATATCT AAATAAAGCA CGCTTArAAG TGAGTTTTGA 10380

TTACATGAAA ATATGCAAAA CGAGTATAAC TGCTAATTGA TAGAAATAGC TCACCATAAA 10500

ATTACGGTAT GATTTTAAAT ATAAGTAAGT CGCACTACCT GCTAGTATCA ATGCTGGAAT 10560

GAATTCCCAC CATGTATTAA TGTATGGATA GTAGAACAGA GTTTCAAGGA TAATGGACAA 10620

TACTATTGTA ATCTTTAAAG GTATTAATCT GCTTAATTCT TGAATTAAAA TATGACGGAA 10680

AATAAGTTGA CAAATCAAAG TATTTAATAT AATGGTTAAC GAAAATATAG CTATTAAACT 10740

GATGGAaCCA TACCCTTTAA TGAGCGGGTA AATGTCAAAG ACAGTAAAGG AATCTACATT 10800

TAGTGCGAAA ATATTGAAAT GATTTAAAAG TAAAAAGAGT ACGACACTTA GTGTAAATGA 10860

TATAAGAATA TGCCATTTAT ATTTAGCACT AGCAACGATT TGCGAACGTA TCATTGGAAT 10920

AAACGCATCT TCATGCATCA GACGAAAAAT AGCTAGTGAA ATAATAACTG CGAGTAAATA 10980

GCTAATGTTC ATTGAAATAG GAAAAGAGAA ACCCCACGGA GCTTGTTGAG TGAATACAGC 11040

TACTAACCCA AAAGTTAAAA AGACGATAAT GATCGGCAAG ATGTTAACCA AAAATATGTA 11100

AAGGAAAATA AATCCAATAT CACGTTTGAA AAAACGCGAT TGTTCGGTAG CGTATTCTTC 11160

TTCTATGTAA TGTTTATTTG TATTTGACAT AGTATACCTC TTAAATAGTT GTATTATATA 11220

GATACTTTAG CACATATTAC TTTGTATTGT ATGTTTTATA CATTAAAATT TAAAATGAAA 11280

AACATATCAT AAAATTGTTT TATAAAATGA AGCGCTTCCA TTGTGTTTTG TTTTGTAAGG 11340

TGTATCATAA ATATTGAATT GAAATTTTGG GGGGAGGTAT TGTAATGACG TTTCTTACAG 11400

TCATGCAATT TATAGTTAAC ATTATCGTTG TAGGATTCAT GCTTACGGTT ATTGTTATCG 11460

GGCTTATTTG GTTAATTAAA GATAAAAGAC AATCACAACA TAGTGTATTA AGGAATTATC 11520

CTTTACTAGC ACGTATTAGA TATATTTCAG AAAAAATGGG ACCGGAATTA CGTCAGTATT 11580

TATTTTCTGG GGATAATGAA GGGAAACCTT TTTCACGTAA TGATTATAAA AATATCGTTT 11640

TGGCfiGGAAA ATATAACTCT CGTATGACCA GCTTCGGTAC TACTAAAGAT TATCAAGACG 11700

GLTITTACAT ACAGAACACA ATGTTTCCGA TGCAACGTAA TGAGATTTCA GTAGATAATA 11760

CAACATTGTT ATCAACATTC ATTTATAAAA TCGCGAATGA GCGTTTATTT AGTCGTGAAG 11820

AATATCGTGT GCCGACAAAG ATTGATCCGT ATTACTTAAG TGATGACCAT GCAATAAAAT 11880

TAGGTGAACA TTTAAAACAT CCATTTATTT TAAAACGTAT CGTAGGACAA TCTGGTATGA 11940

GTTATGGCGC TTTAGGAAAA AATGCCATTA CAGCTTTATC TAAAGGTCTA GCTAAAGCGG 12000

GCACTTGGAT GAATACAGGT GAAGGTGGCT TATCAGAATA TCATTTAAAA GGTAATGGGG 12060

ATATCATTTT CCAAATTGGT CCCGGTTTAT TTGGTGTTCG TGATAAAGAA GGTAATTTTA 12120

GTGAAGGTTT ATTTAAAGAG GTTGCACAGT TATCTAACGT ACGCGCATTT GAGCTGAAGT 12180

TTGCTAAAAT CCGAAATGTT GAACCTTATA AAACAATCAA TTCACCTAAC CGTTACGAAT 12300

TTATTCATAA TGCTGAAGAT TTGATTCGTT TCGTCGATCA GTTGCAGCAA TTAGGTCAAA 12360

AACCAGTAGG ATTCAAAATT GTAGTAAGCA AAGTTTCAGA AATTGAAACA CTTGTACGTA 12420

CGATGGTGGA ACTAGATAAG TATCCAAGCT TTATTACGAT TGATGGTGGT GAAGGTGGTA 12480

CTGGTGCAAC ATTCCAAGAA TTACAAGATG GTGTTGGCTT ACCGCTATTT ACAGCTCTAC 12540

CTATTGTGTC TGGCATGTTA GAAAAATATG GTATTCGAGA TAAAGTGAAA TTGGCGGCAT 12600

CTGGTAAGTT AGTGACACCA GATAAAATTG CGATTGCACT AGGTTTAGGT GCAGATTTTG 12660

TAAATATCGC ACGTGGGATG ATGATTAGTG TCGGTTGTAT AATGAGTCAA CAATGTCACA 12720

TGAATACGTG TCCTGTAGGT GTTGCAACGA CAGATGCGAA GAAAGAAAAA GCATTGATTG 12780

TTGGAGAAAA GCAATATCGT GTCACAAACT ATGTAACAAG TTTGCATGAA GGCTTATTCA 12840

ATATTGCAGC AGCTGTTGGC GTATCCAGTC CTACAGAAAT TACTGCTGAT CATATTGTAT 12900

ATCGAAAAGT CGATGGTGAG TTACAAACGA TACATGATTA TAAATTAAAA CTCATTAGTT 12960

AACTTAATTA TTTCGGGAAA TTGAAAGCAG CGGATTTTAG CGTTACTGCA AATAATTTTA 13020

TATTAGTAGT GGATGCTGGT CACACAAGAA CTTCAAATAT TAAAGCCCTC AGAATATGAA 13080

TTAAGGTTTG TAACCTTAGT CTTATCTGAG GGCA1T1TTA AGTTATAAAC TATTTGTCGT 13140

CCATTTTATC TTTTTCTTTT AAACCTCTGT GCTTTAATTG CTTTTCAAGT TTTTCAAAAC 13200

TAATATCTTT ATTTTCTTTA GTCGAAACAC CAAGACGTTT ATTTAATTTT TTCATGTCAA 13260

CTTCTGTGTA ATCTATGTCT AAGTGyTCAA TTGCTTTTTT ATCTTTATAG TCTACTTTGT 13320

ATTTTACGCC TTTAAGGTCT TTGAAAATAC TTTCAGATTT GGCGAATAAC TTTTTGGCTT 13380

CGTCTTTATC CATACCTAGA TCGTCATATT TAATTGTGTT GATTGTAGAC TGTTTTAAAA 13440

CTTfATCATC TTTATATGTG ATAGAAGTTA GTACATGTTT ACCACTAACA TCACCwTCAT 13500

ATGTTTTGGT TTGTTCTTTA CCACAAGCTG ATAATGCAAT GATACAAACT AATGCTACTA 13560

CAATTAATGA ACATAATTTT TTCAAAGTCA GTCGCCTTCT TTCGATATTT GTATTATAAA 13620

GAAATTATAA CATTTACTAA AAAATGATGT TATTCAAAAA TTTAAATTTT GTCATTTTTT 13680

TTGAAGATAT GAGTTTTTTT AAGCGGATTC CTCACAAAAT TTTAAAAATA TTTAAGCCTk 13740

AAAATGATAA AGCGkTAGGG AACGTTTTTC TGAAAGTTAG TGATACAATA GTTTTAAGTT 13800

GAAATACAGG AGGATGAATA ACATGAATCA GTCAGTCAAA TTACTTAAAC ATTTAACAGA 13860

TGTAAACGGC ATTGCTGGTT ATGAAATGCA AGTTAAAGAA GCAATGCGTa ACTATATAGA 13920

GCCTGTCAGT GATCAAATTA TTGAAGATAA CTTGGGTGGC ATTTTTGGAA AGAAAAATGC 13980

AACAAAGATT GATAAACATG GTTTTATTTC ATTTACGCCA kTgGTGGATG GTGGAATCAA 14100

GTCATGCTAT CTCAAAAAGT AACGATTACA ACAGATTCGG GCAAAGAAAT TAGAGGTATC 14160

ATCGGTTCTA AACCGCCACA TGTCTTAACG CCTGAAGAAC GTAAAAAGCC AATGGAAATC 14220

AAAAATATGT TTATAGATAT TGGTGTTAGT AGCAAGGAAG AAGCTGAAGA AGCTGGCGTT 14280

GAAGTAGGCA ATATGGTTAC GCCATATAGT GAATTTGAAG TGCTTGCAAA TGATAAATAT 14340

TTAACTGCGA ArCATTTGAT AATCGCTATG GCTGTGCATT AGCTATTGAG GTATTAAAAC 14400

GTTTAAAAGA TGAAAATATT GGCATTAACT TATACAGTGG TGCCACAGTG CAAGAAGAAG 14460

TTGGTTTGCG TGGTGCGAAA GTGGCAGCGA ATACGATTAA ACCAGACTTG GCGATAgcTG 14520

TcGATGTAGG TATTGCTTAT GATACCCCAG GTATGTCAGG TCAAACGAGC GATAGTAAAC 14580

TAGGCGGTGG TCCAGTTGTC ATTATGATGG ATGCTACAAG TATTGCTCAC CAAGGTTTGC 14640

GAAAgcATaT TAAAGATGTA GCTAAGGAAC ATAACATCGA AGTACAATGG GATACGACAC 14700

CAGGTGGAGG TACAGATGCG GGAAGTATTC ATGTCGCAAA TGAAGGTATT CCAACGATGA 14760

CAATCGGTGT TACGCTGCGA TACATGCATT CTAATGTTTC AGTGCTCAAT GTAGATGATT 14820

ATGAAAATTC TATCCGTCTT GTTACTGAAA TTGTCCGTTC ATTGAATGAT GAAAGTTATA 14880

AAAATATCAT GTGGTAATCA AATCCATAAA TAATAAAGAA TCCTTTTAAT ATGGTAGGTT 14940

GTTAAACAAT TGTCTAATTT TAATTCTTAG TCATTAGACA GTATCCATGT TAATAGGATT 15000

TTTTGTTTTT AATTTAAATG CTGAAAATCA ATTATGCCTA AATTTTGATA TTACAAGAAA 15060

ATGATTTTTT CTTAAATGTA ATTGCACTAA AAACCAAAAA AACGGGAATA ATATACCTGA 15120

TATATTACAT GAGGAGCGGT GCAAATGTTG TTAGAAATTA AAGATTTAGT GTATAAAGCG 15180

AGCGATAGAA TCATACTAGA TCATATCAGT CTAAAAGTAG ATAAAGGCGA GAGTATTGCC 15240

ATTATAGGTC CATCAGGTAG TGGTAAAAGT ACATTTCAAA AGCAAATATG TAATTTGTTT 15300

AGTCCAACTA GTGGAGAACT TTATTTTAAA GGTAAACCCT ATAATGATTA TGACCCGGAA 15360

GAATTGCGTC AACGAATCAG TTATTTGATG CAGCAAAGTG ACTTGTTTGG TGAAACGATT 15420

GAAGATAACA TGATATTCCC ATCACTTGCA CGTAATGATA AATTTGATAG AAAACGTGCA 15480

AAGCAATTAA TTAAAGATGT CGGTTTGGGA CATTATCAAT TAAGTTCGGA AGTGGAAAAT 15540

ATGTCGGGTG GTGAGCGGCA AAGAATTGCT ATAGCGCGCC AACTGATGTA TACACCGGAT 15600

ATTCTTTTAT TAGATGAATC GACCAGTGCA TTAGACGTTA ATAATAAAGA AAAGATAGAA 15660

AATATCATTT TTAAATTAGC AGATCAAGGC GTGGCAATTA TGTGGATTAC CCACAGCGAT 15720

GACCAAAGTA TGCGACACTT TCAAAAGCGT ATAACAATTG TTGATGGTCA AATTTCTAAT 15780

CATTCCGATT ATCATTTCAT ATAAAGAAGG TTTACATATT ATTAAAGATT TAATTGTTGC 15900

GACATTACGA GCAGTTGTGC AATTAATCAT TTTGGGATTT TTGCTGCATT ATATTTTTAA 15960

AATAAACGAT AAATGGCTGC TTATTTTATG TGTATTGGTC ATTATTATTA ATGCATCATG 16020

GAATACAATT AGTCGAGCAT CACCAGTGAT GCATCATGTG TTTTGGATAT CATTTCTAGC 16080

TATCTTCATT GGAACGGCAT TACCGCTTGC AGGTACTATT GCGACAGGGG CCATTCAATT 16140

TACCGCAAAT GAAGTTATAC CTATCGGCGG CATGCTTGCA AATAATGGCT TGATTGCAAT 16200

TAATTTAGCT TACCAGAATT TAGATCGTGC ATTCGTACAA GATGGTACTA ATATTGAATC 16260

TAAATTATCA CTTGCAGCTA CACCTAAATT GGCTTCTAAA GGTGCAATAC GTGAAAGTAT 16320

TCGTTTAGCT ATAGTGCCAA CTATTGATTC GGTTAAAACA TATGGGCTTG TGTCGATTCC 16380

TGGTATGATG ACAGGCTTAA TTATTGGTGG CGTACCACCT TTACAAGCGA TTAAATTTCA 16440

ATTGTTAGTC GTGTTTATTC ATACAACTGC GACCATTATG TCTGCTTTGA TTGCGACATA 16500

TTTAAGCTAT GGTCAATTTT TCAATGCAAG ACATCAATTA GTAGCACGAA ATACTGATGT 16560

TAAGAGTGAA TCATGATAGA TTTTACTGCA TCAGATTTAG GCATTAGTTT TAATTGGAAA 16620

TGAAGTGACG CGCACATATA GTATCGCTAT TCATTAGCGC AGCGAAAATA TTCATAAAGG 16680

CACGCATACT TTGTAGTCAG TTATCTGTTC TGACATATAA AGCGTGCGTG CTTTTTTGGA 16740

GTTATTGTTG AAACTGAAGT AATTATACAT AATTATTAAA TGACATACTT GTGTTAATTT 16800

TTCAAATACT GAAAAACAAT TTCaATAATT TTCCaATTAA GCACAGAAAA TTAAAGCAAA 16860

ATATTATATA ATAGAACGGT TATATATaAA JlATTngTgCA CACATTTTTT AATAAATCGT 16920

TATTCTAAGG GAAATGAATA TCGGAAATTT TGTTTGAAAG GAGTTTTAAA TTGTCAATCA 16980

TGCGACTATT TACATTCATT TTAAGTATTT TTATCGTAGG AATGGTTGAA ATGATGGTTG 17040

CAGGAATTAT GAACTTGATG AGTCAGGACT TACATGTATC AGAAGCTGTC GTTGGTCAAT 17100

TAGTGACAAT GTACGCTTTA ACATTTGCGA TATGTGGACC TATTCTGGTT AAATTAACGA 17160

ACCGTTTTTC ATCAAGGCCT GTATTATTAT GGACATTACT TATATTTATC ATTGGTAATG 17220

GCATTATTGC TGTAGCGCCA AATTTTTCaA TATTAGTAGT TGGTAGAATT ATCTCATCTG 17280

CAGCAGCAGC ACTAATTATC GTAAAAGTAT TAGCTATTAC AGCGATGTTA TCAGCACCTA 17340

AAAATCGTGG TAAAATGATT GGACTTGTCT ATACAGGGTT TAGTGGTGCT AATGTTTTTG 17400

GTGTACCAAT TGGAACGGTT ATCGGCGATT TAGTAGGTTG GCGCTATACA TTTCTATTCT 17460

TAATTATTGT GAGTATTATT GTTGGCTTCT TGATGATGAT CTATTTACCG AAGGATCAGG 17520

AAATACAACG AGGCCCTGTG AATCATGAGA CACCATCTCA TGAAAATCAT GTTACTTCGA 17580

CAAACTCAGT GACATTCGTC TTTATAAATC CACTTATTTT ATCTAATGGT CATGATATGT 17700

CATTCGTTTC ATTAGCACTT CTAGTAAATG GAATCGCTGG CGTTATTGGA ACATCATTAG 17760

GTGGTATATT CTCCGATAAA ATTACAAGTA AGCGTTGGTT AATGATTTCT GTTTCTATTT 17820

TTATCGTCAT GATGTTACTT ATGAATTTAA TCTTACCTGG TTCAGGTCTA TTGTTAGCAG 17880

GACTATTTAT TTGGAATATC ATGCAATGGA GTACTAATCC AGCAGTGCAA AGCGGTGTGA 17940

TTCAACATGT TGAAGGCGAC ACAAGCCAAG TAATGAGTTG GAACATGTCT AGTTTAAACG 18000

CTGGTATTGG TGTTGGAGGC ATTATTGGAG GCTTGGTCAT GACACATGTT TCTGTTCAAG 18060

CTATCACATA TACGAGTGCC ATCATTGGCG CATTAGGATT AATCGTTGTT TTCACATTGA 18120

AAAATAATCA TTATGCTAAA ACATTTAAAT CATCATAATT CTCATATGAm AAGCACGCCT 18180

GCTATCAAAT TCAGGTGTGC TTTTTTAGAT GCGATAACGT TATTGATATG TGCGATAATA 18240

GCGACGTTCA TTATGATACA TCGGCCAAGG CATTTTACCG CTTTTAGCAA AATTAGCTAA 18300

ATCATTTTGC ATTTGTCGAC TTAAAAATTT AAGGTGaGCA GTTGTTGGaT ATgAT 18355

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1192 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

CGCAAAGAAG TACAAAAAAT GTTTTTACAA GAAGGTATTA AAACACCTCA ACCAATTATG 60

ACTGCTTATA ATCATAGTGA AAACGgTGTT TAGTAGTTTA TAATACATGG AGGTCATATT 120

TAATGGCGTC AAAATATGGA ATAAATGATA TAGTAGAAAT GAAAAAACAA CATGCGTGTG 180

GAACAAACCG TTTTAAGATT ATTAGAATGG GTGCAGACAT AAGAATTAAA TGTGAAAATT 240

GTCAAAGAAG TATTATGATT CCACGTCAAA CGTTTGATAA AAAACTTAAA AAAATCATCG 300

AATCTCATGA TGATACACAA AGATAGGAGA ATGATTAATG GCTTTAACAG CAGGTATCGT 360

TGGATTGCCA AACGTTGGTA AATCAACATT ATTTAATGCA ATAACAAAAG CAGGTGCTTT 420

AGCAGCGAAC TATCCATTCG CTACGATTGA TCCTAATGTA GGGATAGTAG AAGTGCCAGA 480

TGCTAGATTA CTTAAATTAG AAGAAATGGT TCAACCTAAA AAGACATTGC CGACTACATT 540

TGAATTTACA GATATCGCTG GTATTGTGAA AGGTGCTTCA AAGGGAGAAG GGTTAGGTAA 600

TAAATTCTTA TCACATATTA GAGAAGTAGA TGCGATTTGT CAGGTCGTTC GTGCATTTGA 660

TAATATGGAA TTAGTACTAG CGGACTTAGA ATCTGTTGAG AAACGTTTGC CTAGAATTGA 780

AAAATTAGCA CGTCAAAAAG ATAAGACTGC TGAAATGGAA GTACGTATTT TAACAACTAT 840

TAAAGAAGCT TTAGAAAATG GTAAACCCGC TCGTAGTATT GACTTTAATG AAGAAGATCA 900

AAAATGGGTG AATCAAGCGC AATTACTGAC TTCTAAAAAA ATGCTTTATA TCGCTAATGT 960

TGGTGAAGAT GAAATTGGTG ATGATGATAA TGATAAAGTA AAAGCGATTC GTGAATATGC 1020

AGCGCAAGAA GACTCTGAAG TGATTGTTAT TAGTGCAAAA ATTGAAGAAG AAATTGCTAC 1080

ATTAGATGAT GAAGATAAAG AAATGTTCTT AGAAGaTTTA GGTATCGaAG AACCAGGATT 1140

AGATCgrTTA ATTAGGAmCA ctTATGAATT ATTAGGnTTA TCCACCATAA TT 1192

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7494 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

AATATAGCTG CAATAGCATC TCGTTTCATT TGTATAATCA ATTCCGGTTT AAATATCAGT 60

GTGAACGTAA GCACGACACA GATTAAAAAT AACACTGCCG GAATGAGTCG TTTCAATCGT 120

CGCTtCCAAA ACTCTAGCAA ATCGATTTTT TGCGTCCGAT AATACTCACT TATCAACAAA 180

CTTGTTATTA AATAACCTGA AATAACGAAG AATGTATCTA CTCCTAAAAA GCCCCCACTT 240

AACCATTGTG CATTCAAGTG ATAAATAATG ATTCCTATAA CTGCGAATGC CCTCAATCCA 300

TCTAATCCAG GTAAGTATCG CGGGGAATAC ATTTTTTCTA AACGTTTAAA GTCTTTTGTA 360

TCCA'&TTAA TAAACGCCCC ATTTATTTTT CTCTATTTTG TAGTATATCA CAATATTTTT 420

GAAAATAAAA TATTGCACTG aTTTTCATTA ATTGATTTAA CCCTTAATTA AGATAGTTTT 480

AAATTTTTTA TTAAGTAGAA AACAATTATT ACAGTTGATT TCATTACTGC AAACCACATA 540

TAAATTTGTC GATTTTACTA CATAACATAG ATTATCATAG ATTCTTGAAT TTTTAGCAAA 600

ATAACTGTTA TTTTCATTAT ATTTTTACAA AAAAAGGTTC GTTTTATATT TTATGCATCT 660

TACTGTAACA GAATCATTAA GATATGCTAT TCGAATATAC TTTTTCAAAA TTTATATAAT 720

GAATAAATTA ACATGTATTG AAAAAAAAGC GAAATGCAGC CTATCCTCTA ATGTAAACCA 780

AACGATATAT CTCGTCAGAC TTTATATTTA AACGCTATGT GTCACTTTTA AAATGAATAT 840

TACTAAGATT GTCATATCAA TTATTATTGC ATCGAATTAA TCTTTTAAAT TTCTGTAATA 900

ACGGAAGTCA TTATTAGAAT AAAAATACTG TGCACTAATA AATTTATCAA TTGTTCCTAA 1020

ATAAATACCA TCGATATTTT GTTCTTTACA TGTCATTATA ACTTTATCTA AAAGTTTTTT 1080

ACCTATTTTT AAATTCCTAT AACCTTTATC AACAAACATT TTTTTAAGTG CAGACATATT 1140

ATTATCTAGT CTAATCAAAC CTATAGTACC AACAATATTT TGaTGATTGT TTATTGCAAG 1200

CCAAAATgCC CTCCATTATT CAAATAGTTA TGTTCGATGT TCTCCAAATC AGGTTGATCA 1260

TCTCTATCAA TTTTTATATa AATTCATTTT TTTGAATCGA TAAAATAAAC TCGATTAGCT 1320

CTTCCTTATA AGACCTATTA TATTCAATTA TGTTTATAGC CATTTTTATC TCCTTTTTCA 1380

TTTAATTTAA TTATAAAATG TGCGTTTAGT TTGTATCTAG TGTACTCAGT ACAGCCTCAA 1440

ATGAAGTTTC ATTCCACTTG GCACTTAATA AAGACAAGTA TTTTAGCAGT AATACAATAA 1500

AGTCCAATAA ATTTCCCTAA CTTCAATATC CACTTTTTAA AAAATGTATT TTTAATTAAT 1560

AAAAAAACTC TCCCCAATTT CTATGGGAAG AGCTATATAT TTAATGTCTA AACATTACTT 1620

TTATTTATTA TGAAGGAATT AGAATCCCCA AGCACCTAAA CCTTGTGCTT TGTATGCTTT 1680

AACAGCTGCG TTGATTTGTT GGTCAACAGT GTTTGTTGGA CCCCAACCTG GCATAGTTTG 1740

GAATAAACCT GAAGCACCTG ATGGGTTGTA AGCATTTACT TGACCATTTG ATTCACGAGC 1800

GATGATTGCA GCCCATGTAG AAGCTGAAAC ACCAGTACGT TGAGCCATGA TTTGAGCTGC 1860

TGATGAACCA GTAGCACCTG CAGTATTACC ATTGCTTAAT CTCACTGAAC TTGAAGTAGT 1920

TGAAGTGCTG TAGTTATGGT AAGTTGGAGC TGAAACAGCT TCAACGTtTG AGTTACTTGA 1980

TTGTGCATTG TAGCTTACTG ATTGTACATT TGAACCTTGG TTGTATGAAG TAGTGTAGTC 2040

TGCACCTGCA ACGTTTGAGA AACCAGCAGT TTGACCATTA GCTGCTTCAT AGCTCCATGA 2100

CCATGTAGTA CCATTTGAAG TGAAGTTATA TTGGAAACCA TCTTTTACAA AGTGGATGTC 2160

ATATGCACCA TCTTTGATTG GAGCTGCATT TAATTGATCT TGGTGATTAT GCGCTAAGTC 2220

AACTAAGTGT GCTTGATCAA CGTTTACTTC AGCAGCGTGT GCTTGATGTC CTGTACCTGC 2280

TGCGTAACCT GTTACACCTA ATGCCACTGC TAATGATGAT GCCATAATTG TCTTTTTCAT 2340

AGTAAAAAAT CCTCCAGTAA TAATTGTnAG TTTATGTTTT TAGTAATTAT AtTTTGaATT 2400

TGAATGTCGT AGTgCAAGTT TAAATTGTCT TTTATTTCTT TCaACGGTAC TCACTATATC 2460

ACAaAAAACC AGCCAGTAAA TTACACTTTC TTTACAAAAC ATTACAATAT CAAGTGTTAT 2520

TTGtAATGTT GAAATATGGC TGTTTTATAC TGTAATGTGA AATATGTGCC CTTTAGAATC 2580

CAATCAACCC TTGAAATAGT CTTTAACACA TAAGATTTTT ACTATATTTA GCTCAACTAT 2640

TACAGCTTTC GTAATATTAC AGATTGTATT TTTGTTACAT AGCTGTAATA TATCTGACAT 2700

TACACATGTA TTGATTGCTA TTATTGTTGT ATATTCAAAG TTTTAAAACA CACATCTTTT 2820

GTGAATTGTC TTATCTTTTA TTAGCGCAAA TAAACTGCAG CTCAATTATA TTGTTCAACT 2880

TCATTCTCGC AATTCACAAT AACATTAAAT AATTTTTGGT CTCATATTTT CAAAAAACAT 2940

ACTGTTATTA TCCCATGAAT TTAAAAATAT CATTAGTATA TAAACGAAAC ACTTTACGAT 3000

AAATGATATC TGCAAGCCAA GCTGTTACAA ATGGTACAAC AAAGAACGCT ACTACAATTA 3060

GTAAGACACT CAACCAAGCA GAATCAACCT CCATAAATTT AAATGCATTA ATCGGTCCTA 3120

CCATTCCTAT AAAACCAAAT CCAGCTGACT CTTTCGTTCC ATGAATACCT ACTAATGCTG 3180

ATACCAAACC TGATACAATG GCTGTCGTTA ATATTGGTAA CATAAGAATT GGATATTTCA 3240

CCATATTAGG TATCATCATT TTAACGCCTC CAAAGAAGAC GGATAACGGC ACCCCTAAAC 3300

GATTCACTTT ACTTGTACCA ATTATCAATA CTGCTTCAGT CGCGGAGATA CCAATTGACG 3360

CTGATCCAGC TGCTAAACCT GTAATACCTA TCGCAAAGGC AATGGCCACA GTTGATAGTG 3420

GCGAAATAAT AATAAGACTA AATACCATTG AAATCAAAAT ACTCATGACA ATCGGTTGTA 3480

ATTCTGTAAA ACCATTAACC ATATTACCGA TGGCTGTTGT AATCATTTTC GTATACGGCA 3540

ATATTAAAAC ACCAATTGCA CCTGAAATAC CGCCAACAAC TGTTGGGAAT ACAATCAATG 3600

CCATACTACC TACGCGATGT TGAATAAGTA AAATGAATAA CACTGCAATC GCTGCTGTAA 3660

TCATTGTATT AATTAAATCA CCAATACCCG TAATCATCCA AGCACCATTT TTAAACTGCG 3720

CTGCACCGCT TCCTACATAT GCTGCACTTG CCACAACAGC AATTGCTAAT GGCGATAGGT 3780

CAAATTTCAT GGCAACCAAT GCACCAATCA AAGCAGGTAC TGTAAATTGA ATTGCAACGA 3840

CAACGCCTAA TAACGTTTTA AAAATCGGAT GATAATCCAT AAAGTATTTA AAAATTTCTC 3900

CAAGTATCGC ATTAGGAACT AAACCCGCAA CAATACCTAT GGCGACACCT GATAAAACTC 3960

TAAATATAAA ATCTTTGGGT GTAATTGTTT TAATTGATGT CATAATATCA TCCTTCCATT 4020

TATGTATATA CATCTGTATG CAAATAATAA AGAGCCTTAA GTTATAAGCT GCCACTAGCT 4080

TAAATTCTAA GATGTGCATG CCGATGTTGT TATATTTAGG CTAGCAGTAT CATCTATAAC 4140

TCAAGACTAT GAAAAATAGT ATATCACAAA ATTCTGAATT TTTAGATAAA TAAATTGGCA 4200

ATTTTTCAAA CATATTGTTA CAATACACTT TTATTTTATC TTCATTTTTA AAATGCATTA 4260

ATACAATAGA AGAAAGACAT TCAAATGCTT ACCAAAAAGG TACATTATTT GTTAGGAGCG 4320

TATCAGCaCT TACATATCAT CAACACAATT GACAATATAA TAGAAGATAC TGATAATAAG 4380

7GTTAAAACA ACAGATGTTA GGTAGTGAAC AAATGATGGA AAGTAAATCC ATAGATCCAA 4440

r:AAT~GTTAG AACrAAACAA T^G CTTOT GG ATOCTTTTCT TAAAATTTGT AGAGAAAAGA 4^n

TTTACGCTCA TTTCGCTGAT AAAGAAGACC TCCTAGACTA CACATTATCT GTAACCATTT 4620

TAAAAGACTT GAATGATAAT TTGAGCATTT CTAATGTCAT TAATGAAAAG GTTCTGCGTA 4680

ATATTTTCAT TTCAATTGCG AGTTATATCA AAGATGCTGC AAAGTCTTGC GAATTAAATA 4740

GTGAAGCATT TTGCAACAAA GCACATCAAC GTATTAATAA TGAATTAGAA GATATTTTTG 4800

CGATTATGTT AGAAAACAGC TATCCGGAGC ATCAACGAGA TATCATTGTA AATAGTGCGA 4860

GTTTTTTAGC AGCTGGTATC TCAGGCTTAG CATTACATTG GTTTAACACG AGTCAAGAGA 4920

CAGCCGATGT GTTTATCGAT CGCAACCTTC CATTTTTAAT TCATCATATA GCACATTTTT 4980

AATAAAACTT GGTATTTAGT CATGCATCTT GAAATCACTA TGTGACTTAG GTTCATACTT 5040

GTACACACAA TAAAATTTAA CGTATTACGA TTGATTAGCC GTGTCTAGGA CATAAATCAA 5100

CGTCCTATAC TCTACAATGT CATATTAGCA GTCGTTAACT GAATGAAAAT AAGCTTGTCA 5160

TTAAAACATA TAGATTTTAG TGACAAGCAT TTTTGTTTTT GCGTACTTAA ACAACACTTC 5220

AGGCAATATG TTGTTTAGGC AACAAATGAT ATGTGCGTGT TTATTGGCAA ACGTACGACA 5280

TAGTAGTATA GTATGTCTAA ACAACATATG TTGCATAGTT GATATGCGTT GTTTAAATAC 5340

TAAGATAGGA GGGATTGACG TGAGCGAGAC AGATGAACCT CAGGGGTTTG AACGCACGCA 5400

TAATATATTA AATATTAATC AGAGTAGTCT GGGTGTAGTG ACATACATTA CAAATAAATT 5460

AAAGTCGACG TTGAAGCAAC ACATAATAAT TGCTCGTGGT AAAAAGCGAA TCGACTATCG 5520

ACTGTCGTAT AACTTTTACA TACGTATTAT GATAATGTAG AAATCAAGAA AATCGACTGT 5580

GAATATACCT ATGCTATGCC CATTGCAATT TTAATAAGAC ACACGATGTC ATTCGACAAT 5640

GCTCATTTCT TTGCTCAGTT ACGTCATCCT GTCTTATAAA ACAACATTGC AGACATGTAT 5700

ATCAAACGAC ACTTCAATAA CATCACTTTG CCcATCGTAC TACTAGTAAA ATCGTGTCTC 5760

AAATCCCTTA TTTTAATTCC AAAAAtCTGC TGGTCAAAAG ACCGAGAAAC TAAAAACATT 5820

ACTTAATGTG TTGATAAATT ACCATATAAA AATAATCTCA AAATATATCA ACACTTGATT 5880

CTAAGGAGGA TATGACAATA TGAAAATTTT AGATAGAATT AATGAACTTG CAAATAAAGA 5940

AAAAGTACAA CCACTTACTG TAGCTGAAAA ACAAGAACAA CATGCATTGC GTCAAGAcTA 6000

CTTAAGcATG ATCCGAGGAC AAGTATTAAC AACATTTTCC ACAATAAAAG TGGTTGATCC 6060

AATCGGTcAG GATGTCACAC CAGATAAAGT TTATGATCTT CGCCAACAAT ACGGTTATAT 6120

TCaAAATTAA tATTTGCTCA CGAGGTATTG CACTTAAGGT GCCAACTGAC CTCATAAACA 6180

AAGCCCATAC TGATTGAAGA CACTAATGTG tCsaCCATGG TGCACATTAC GCTTCATCTC 6240

TGTATGGGCT TTTTATTTAT TCTTTTGAGA ATTTCATTTT AGCAGACCAA AAAATTAAAA 6300

TGAACGACTG TGCCACCCGC TTCTTTCACT TTATTCACCA ACTGGTCAAC TTCTTCATTT 6420

GTGTTCACAC CTAGAGAAAT CATCACTTCA TTTGGTTCAG TATTAAGGCT TTGCTGACTT 6480

ACATTTTGAA AATGCTTGTn TTCTATTAAA ATTACGGJcTG tTTGACCTAT tTGAATGCCG 6540

ACCATTTTAT CTAACATTTG TGGGTTTCTA TTTATTTTAA ATCCTAACGC TTTATAAAAC 6600

TGTGCGCTCT TTTCTAAATC TTGCACATGC AAATTAAACC ACATTGATTG AATCATGATT 6660

GCACCCCATT CATTACTTAT TATAGTTTTG GACTTTAAGC CAATCACTTA ATGATAATCT 6720

TGTTGGATTT ATTTCAGCCA TTAATTCAAA GTCTACTTCA TAACCTTTTT CTTCCAACCA 6780

TTGCTTTTCT GCAACACCAC TAACAAATTC TCCTTCTATA ACAGTAGATT TACCTGTCAC 6840

TTCACTAAAA ATTGTTGCTG CTTCACTTAA TGTAACTTCA TCGGAACCAA TCTCTATTGA 6900

TTGATGCGTA AAGCTTTGTG GATGTGCAAA AATATACGAT GCAATTTTAG CTATATCAAT 6960

AGAAGAAATC ATTGTGAATT TTATATTCGG ATTAATAAAT TCTGGTAATG TAATACGTTC 7020

ATCTTCGACT TTAGCAATGC GTAAAAAATT ATCCATAAAG AATGATGGTT TGATAACTGT 7080

TGCATTTATA TTAGATTCCA TTAATCTATT TTCTATTTTT GCTAGTACTT CAAAGTGTGG 7140

GCCAGTTCGA TTTCGATTAA CCCCTCCCGC AGTACTATAC ACAATATGTT GAATATTTTC 7200

7TGCTCAGCT ATTTCAATTA TCTTCATACC TTGTCTTAAT TCTTCGCTAA CATCATCTTT 7260

AACGATTGGC TGAATACTGT ATAAGCCATA CTTACCTTTC ATCGCTGATT GCAAACTAAC 7320

ATTATCACTC AGATCACCTT CArCGATTGA TAAATGCGGA TGTCCTATGT CTGAAAGTTT 7380

ACGATTATnC TTATTTCTAG TTAATGCACT TACATACCAT CCATCCTCTA ACAACTGTTT 7440

TACAACTGCA TTACCTTGCT TCCCTGTTGC GCCTATTACn AAAATATCTT TCAT 7494

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11802 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

AATTTATTTC GCCGTCCCAC CCCAACTTGC ATTGTCTGTA GAAATTGGGA ATCCAATTTC 60

TCTTTGTTGG GGCCCcGCCC CAACTCGCAT TGCCTGTAGA ATTTCTTTTC GAAATTCTCT 120

GTGTTGGGGC CCCTGACTAG AATTGAAAAA AGCTTATTAC AAGCGCATTT TCGTTCAGTC 180

AATTACTGCC AATATAACTT CGTAGATCAT AGAACATTHA AT^rr^r — A - T — - < - 
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AGCAAAGGTA ATAATGATAT TAATAATGTA CAAAAAATAT AAATCAAATC GACATCCTTA 360 

TAAAACATCA GAACCACTAA AAACAAAAAA GCACAAAATA AAATTAAATT TAAAATAAAC 420 

GACCACTTTT CAAAAAAATC TCtTTTCaTa TTTCCACCCC TAATTTTAAT AAGCATTATT 480

TTATATTCTC TTTTAAGTTT ATTATTCAAA AGGAAAACAG AAATATCTTT CaATATTATT 540

ATAAACATTT CAACTACTTT TAAAAACCAA CAAAAAAATA CTTATTTTAA GTAGATGAGC 600

ATAAGTGAAC ATAGTTCTTT AGTTATAATA ATTAATTCAA CCAAAAGTCG ATTTGTTTTT 660

GCAATTGGTT TTCATTTCCT CTTAAAGATA TTTTCATTAA ATCTGTCAAA TCAATAGACG 720

CTATATTTTT CAACTTATCT CTATATTTAT TTTTAGTACG TCTTTCTAAA TTTCCCCATT 780

CCTCTTCTTC GTGAGTTAAT AAATGAAGCA TTGCTCGTTC TTGTATATTT TCAATCATTT 840

TTAAATTCGG TTTTAAAATA TGCAAATCAT CAAAACAATC TTTCCAACAA TCAACCATAT 900

CTCGTTTTAA TTCAATTTCC ACACGCCATA GAAATGTTGA ATCAATTTCA ACATCTGCAT 960

TATCTTTACG TTCTTGTTTT TATTATAAAT CCGAATAAAC CTATCACTAT TACGCACACC 1020

AAAATATTTT GTTTCTGGTT TTACATTACG TCCATAAAAT ATAGTTTTCT TTACCGACTT 1080

ATCTGACAAT GCATAATAGT CATTTAAATC AAATTCAAAA TCAAAAGCCA AATCTAATCT 1140

CGTAAAACTA ACATCGTCCA AATAACTGAT GATATTTTGT TTTAACCAAA GCACTTCATC 1200

ATGCGAAAGC TTATTAGGAT TAAATTCAAC GCGCATAtAC GTCTATTCCA AAGAGTTGCT 1260

TTTATTTTGT CATATTCAAT ATAAACTTTT TCTTTAAGAG CTTTAGCTTT AAAGTTTGTT 1320

TGTAAAATAT CCCAAAGCCG AATTTCAGGA TTAGTACTCA TAAAATGTGA AAGTCTCTCT 1380

GCGTTAGACA TGCTAAGATT CCCAACAATC GTTATAGCGT CAAAAGACAA TTTTGGAATA 1440

GCTAGTGACA TCCTATGTCG ATTTAACCGG CTATTACCGG ATATTAGAGT ATCCAGTTTT 1500

ACAAATGGAT GAAACGAAAT TCAAAACACT AAAAAATATG TTCCACTAAC AGCAAAAAAA 1560

TACCATTATG TTCCTACTAA AAAACyAAAA ATACTGGAGA ACAAATGTCA GGATATAACT 1620

TAGGATACTA TGTAATAAAA ATTTACAATA AAAAAACAGG AAAACAAATT TCAAGTAAAA 1680

GmATACCCAT ACAAAGAGGA TAAAATAAAA AACCTCGAAC TGaAATGATG ATCTTTTCAG 1740

CTCGAGGTTT AAATATTGGT GCCTTATTTA TATAGATTCG TTATATTATA TTCTCTATTT 1800

TCATTAACmT AATCCTTAAA GAGTTTTAAA TTAATACCTG CTAGATGATT CAAAAATGTT 1860

TCATCAACTT TTAAATAATT CAATAATTTT TGTGGTGTCA GTAAATnTCT ATCAAAATAC 1920

AACTTTAATA AACTATTCAT TTTGACAGGA CGTGACATTT CAATCACGTC GTCTAAAGAT 1980

AATACTTTCT CGCTTTAnAC AAAnACAAAA ACTTACCCGA TTAAAATCAA GTAAGTTTTA 2040
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TATTTGATAA AAAATCAATA AGTAATTGTG CGCCTTCAAC TTGAATATCT TTTACAACTG 2160 

GCGCGTCGAT ATACATATCA TACTGACCAC CGCCTACTGC ACGATAATTA TTTACACAAA 2220

TTGTATATGT CTGCTTTAAA TCAACTGCGT GACCTTGAAT CATCATATTG CTCACACGTT 2280

GTCCCTTTGG TCTTCCAACA TGAATGGTAT AACTTACGCC ACCATATATA TCATAATTAA 2340

AGTGTTGTGG TTTGGGTTCA AGGAAGTCTG CGCTCACACT AACTTCATCA TTTTTCACGT 2400

CAAAATATTC TGCTGATCGT TCAATGGCTT CTTTAAGTTT GGCACCACTT ACAGCTAAAA 2460

CTTTAAATGT ATTTGGAAAT GGGTAATTGT TAATAACATC TCGCATCGTC ACGACTTGCT 2520

TGAAACCACT AGCAGAATCA AACAAAGCTG TACAGGCAAC ATCTGCGTCA CTTTTTTCTA 2580

ATAAAGCGTA ATTCATAAAA TTTGTAAAAG GATGCGGTGC CACACGTGCC TCAAATGCAT 2640

GATTAATCGT CATATCATAT GGCAATGTAG TAATTTCGTA ATCTAACCAG TCCTCTAACT 2700

a r ttt r'QT' aji ATGTTGGTCA TCTTGATCAA TAGTAAATGT GGAATCATCT ATAACAGGAA 2760

GTAATTCACA TGATTCAACG GATAGATTTT CATATTCATC AGTACTCAAG ACTACTCTGC 2820

CTACAGTTGT ACCTCTCGTA CCAGGTTGAA TCACAGCCGT TTGCTTAAAC CTTTCAGCAA 2880

TTTGTCGATG TTGGTGACCC GTAATAAAGA TATCTATATC TTTAGAAAAC GCTTCTAACA 2940

TGGCATATCC TTCATTTTCA CCCGTTAATA CTTCGGTCGG CGTACCACTT TCTAAATCCT 3000

TTTCAAATCC ACCATGGTAA CAAACCACAA TGATATCTGC ATGTCGCTTC ATTTCAGGTA 3060

AGTATTGTTG AAGTATTTCA AAAGCACTAT GAAACGTArT GnCnTGAATA TGCTCTGGTT 3120

GTTCCCAATG GGGAATAAAT TGTGTCGTTA AACCTATCAC ACCAACAGTT TGATCTCCAA 3180

CCTGAAAATA CTTCACACCG TTATCAGTCA ATGTACTATC ATTTTCATAT ATATTAGCGC 3240

ACAAAACTGG ATAATTGAGT CTGCGTAAAG TGTCTTTTAA GTATGGTAAT CCATAATTAA 3300

ATTCATGATT ACCAAGCGTA CCAAAGTCGA ATGCCATTCG ATTATAAAAA TCAACTAAAG 3360

GCTGGCTACT GCCGCTATGC GCGATTAAGT AATTACAAAA TGGTGACCCT TGCAAAAAAT 3420

CACCATTATC TATTTTAAAA CTTTGGTCAT ACTGCCTTCT GTsTTGTTCT ATAACATGAT 3480

TCGCTAGTAA CAATCCCATA GGTTGATATT GATTTCTACT CGTAAAATCT GTTGGGAAAA 3540

TATAACCATG TACGTCACTC ACGACATAAA ATGCTATGTT TGACATCCTC ACTCACTCCT 3600

TCAATCACAA ACATCTTTCT TATTTCTATT ATATATTTAT TTGAAGTCTG TTGTAATCAA 3660

GGTTTTGTCA CCGAGTTTTA AACGAATCTT TGAACCTTCC ATACTTTCAA GTACTTTAGC 3720
ATTGA^TTA ATTGTGACAT TTCCGTTTTr ATCTGCTTTA ACTGTTGGCA AAGTACTGTA 
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TATTGTCATT TCAAATGGCT CATTTACAGA AACATTTTGC GGGATATCAA ATGTTACTTT 3960

TTCGTTCTGA TTTGGTGGTG TATGATCATC TGGTGTGTTT GGCTGAGGAT CTGCGCCTTT 4020

TTCGCTGCCA TAACTACCTG CTTTAAATGT TGTTGGATCA TACCATTTAT AACCACTCGG 4080

CGGTTGTGAC CATGGCTCTT TTTCAGGCTC AGTTGAACGC TCTGGTCGTT CAAAATCAAG 4140

CAACTTAGTC TTTGTATCTA ATGTTAGGCT ACTCGCCTTA AGTGATTTCC CATCATTATC 4200

TTTAGACATC CAAGCCGTTA TATTATTTAA TAGCTTACCG TTGTCTTGTT CTTTAAAACC 4260

ATCATATGTT TTCTTCTTTT CTCCATTATC TTCTCTTACA TATTTGGGCG AACTATCTTC 4320

CACAAGTGAT GAATCACCGA TAAATGCTGC TTTACCTTTT CCAACTTTAG AAATTGCTAC 4380

ATAGGGGCCT TCTGCTTTAC CGCCCCCATT ATAAATACCT TGATCTACAG CATGTGACCA 4440

TTTACTTTTC GCTGGCAATT GTTCTGGTGT ATACACAATA CCTTTTGCTT TCTCTGGATT 4500

AGTAATTGCT AATGTCGATC CGGCATGCAT AGAGACAGAT TTCACACCTT CAGTAATACC 4560

GAAACTTTCT TTTGAAGAAA CAATATTGCT CGTATTTAAA TCACCTAGTG CATTATATCG 4620

AAAACGTACG CCAAAGTTTG TAGATAACCA ATCTGAACTT TTCACACCTT GCATTGCAGT 4680

AGAACTTTTT TCTTCTGCAT TCATACCTTT CGACATATCT TCATATGCTC CACGTCGATA 4740

ACCATTCATT GCCTCCGATG AATCAATACG ATTTAAATTT CGGTCAGCAT TGTAATGATC 4800

TGAAATAAAG ACAACATTGC CACCTTGTTt CACATATTTA ACAATTGCTG CCTGTTCTGA 4860

TTCTTTGAAA GGAATGTTAG CCTCAGGAAT TACAAATATT TTGGAACTTT TCAAACTTGC 4920

TTCTGTTATG TTCGAATGAC CATCAATAGC TTTAACGTCA TAACCTTGTT TTTGTATTGA 4980

ATCCGCATAA TCTGAAAATG CACCATCACT AACCCAATCT GCAGCACCAG CTGTTTGACC 5040

ATGAGAACGA TCGAATAATA CCGTTCGCTG TTGCTTTGTA GGTTGCGATT CATGCGTTAT 5100

AGCTftAAGAT TGCGGTAAAG CACTTAATGA TACCGTTGCA ACAATTGCAG AGACAGTTAA 5160

TGACTTATAT ATTTTTTTCA TTTTGTGAGG CTCCTTTTAA AATAAATTTG TTCTTGAATT 5220

ATAGGATAAA AATTCGTTGC ATATGAGCAA TTTAACGAAA AATTTACAAA ATCTTATCAA 5280

ACTCTTAAAG AAAGTTATTA AAATTCATTT TTATAAAATA CTTTTTAACA TTTAAATGTG 5340

GTACGCTATA AGTGTAATTT CATTGCATAC ATATTACACG ATTAAGAATG TGAAGGGGAC 5400

AGTTATCAAA TGAAAAATTT TAAGTGTTTA TTTGTATTAA TGTTAGCAGT CATTGTTTTT 5460

GCAGCAGCAT GTGGAAACTC AAGTTCTTTA GATAATCAAA AGAACGCTAG TAATGATTCG 5520

GATTCTAAAT CAGGAGGATA CAAACCTAAA GAATTAACCG TTCAATTTGT ACCTTCGCAA 5580

AATGCTGGAA CATTAGAAGC TAAAGCAAAA CCATTAGAAA AATTACTATC TAAAGAATTA 5640
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TCTAAAAAAG TTGATGTTGG TTTCTTACCA CCAACGGCAT ACACATTAGC ACATGATCAA 5760 

AAAGCAGCTG ATTTATTATT ACAAGCACAA CGTTTCGGTG TAAAAGAAGA TGGTTCAGCA 5820

AGTAAAGAAC TTGTAGATAG TTATAAATCA GAAATTCTTG TTAAAAAAGA CTCAAAAATT 5880

AAAAGCTTGA AAGATTTAAA AGGTAAGAAA ATTGCCTTAC AAGATGTAAC ATCAACTGCT 5940

GGATATACAT TCCCACTTGC GATGTTAAAA AACGAAGCAG GTATTAATGC AACTAAAGAT 6000

ATGAAAATTG TGAATGTTAA AGGTCATGAC CAAGCAGTTA TCTCATTATT AAATGGAGAt 6060

GTAGATGCTG CGGCTGTATT TAACGATGCA CGTAATACTG TGAAAAAAGA CCAACCAAAT 6120

GTATTTAAAG ACACACGAAT TTTAAAATTA ACACAAGCTA TTCCGAATGA CACAATTTCT 6180

GTAAGACCAG ATATGGATAA AGATTTTCAA GAAAAATTGA AAAAAGCTTT TATAGACATT 6240

GCTAAATCAA AAGAAGGTCA CAAAATTATT AGCGAAGTTT ATTCACATGA AGGATACACA 6300

fJAAArGAAAG ATTCAAATTT caAPATTGTA AGAGAGTACG AAAAATTAGT TAAAGATATG 6360

AAATAATCAT TATTTAACAA ATGAATCATT AGCGAATTTG GTATTAAAAG CTTTCGTTCA 6420

ATAGATATAT TCTAGATTAA TATTGAAAAG CTAGGCGCTA AACTGAAACA GATATAGAAA 6480

GGTG.TCGCTG TACATTTGAA ACCATTTGTA CACAGAAACC CAATGTCTAT GATATTTCAG 6540

TTTACCTTGG CTTTTCTTTA TTAAAGAAAG GTGTCAAACA TGAGTCAAAT CGAATTTAAA 6600

AACGTCAGTA AAGTCTATCC TAACGGTCAT GTAGGCTTGA AAAATATTAA CTTAAATATT 6660

GAAAAAGGTG AATTTGCAGT TATTGTCGGA CTATCTGGTG CTGGGAAATC CACGTTATTA 6720

AGATCTGTAA ATCGTTTGCA TGATATCACG TCAGGTGAAA TTTTCATCCA AGGTAAATCA 6780

ATCACTAAAG CCCATGGTAA AGCATTATTA GAAATGCGCC GAAATATAGG TATGATTTTC 6840

CAACATTTTA ATTTAGTTAA ACGGTCAAGT GTATTACGAA ATGTACTAAG TGGACGTGTA 6900

GGTTATCACC CTACTTGGAA AATGGTATTA GGTTTATTCC CAAAAGAAGA CAAAATTAAG 6960

GCAATGGATG CACTAGAACG CGTCAATATC TTAGATAAAT ATAATCAACG CTCTGATGAA 7020

TTATCAGGTG GCCAACAACA ACGTATATCT ATTGCACGTG CGCTATGCCA AGAATCTGAA 7080

ATTATTCTTG CAGATGAACC AGTTGCTTCA TTAGACCCAT TAACTACGAA ACAGGTTATG 7140

GATGATTTAA GAAAAATCAA CCAAGAATTA GGCATCACAA TTTTAATTAA TTTACATTTT 7200

GTTGACTTGG CAAAAGAATA TGGCACACGC ATCATTGGTT TACGTGATGG TGAAGTTGTC 7260

TATGATGGTC CTGCATCTGA AGCAACAGAT GACGTATTTA GTGAAATATA TGGACGTACA 7320

ATTAAAGAAG ATGAAAAGCT AGGAGTGAAC TAACATGCCT TTAGAAATAC CTACAAAGTA 7380
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AATACCTCAA ATAGGTGATC TATTCAAACA AATGATTCCA CCTGATTTCG AGTATTTACA 7560 

ACAAATTACA ACGCCAATGT TAGATACCAT TCGAATGGcT ATCGTAAGTA CAGTATTAGG 7620 

TAGCATCGTT TCAATACCAA TTGCGTTATT ATGTGCTAGC AATATCGTTC ATCAAAAGTG 7680 

GATTTCAATA CCCTCGCGCT TTATTTTAAA TATAGTTCGT ACTATTCCAG ATTTGTTATT 7740

AGCAGCAATC TTTGTGGCTG TATTTGGAAT CGGTCAAATT CCAGGGATAT TAGCACTGTT 7800

TATTTTAACT ATCTGTATTA TTGGAAAATT ATTATATGAA TCATTGGAAA CGATAGATCC 7860

AGGTCCAATG GAAGCAATGA CGGCTGTTGG CGCTAATAAA ATAAAATGGA TTGTTTTCGG 7920

TGTTGTACCA CAAGCCATAT CGTCATTTAT GTCATACGTA TTATATGCAT TTGAAGTAAA 7980

TATACGTGCT TCAGCTGTGC TTGGATTAGT CGGCGCTGGC GGTATTGGAT TGTTTTATGA 8040

TCAAACACTT GGTTTATTTC AATATCCAAA AACAGCAACG ATTATTTTAT TTACTTTAGT 8100

TATCGTCGTC GTCATTGATT ACATCAGTAC GAAAGTGAGG GCACATCTCG CATGACACAG 8160

GAAATAGCAA AATATAATGT TCACACAAAA GCACACAAAC GAAAATTGAT TAAAAGATGG 8220

CTTATTGCAA TTGTCGTCTT AGCTATTATC ATCTGGGCAT TTGCAGGTGT ACCAAGTTTA 8280

GAACTTAAAA GTAAATCATT AGAAATCTTA AAATCCATAT TCAGCGGATT ATTCCATCCT 8340

GATATCAGCT ATATCTATAT ACCAGATGGC GAAGACTTAT TACGTGGTTT ACTTGAAACC 8400

TTTGCGATAG CCGTTGTAGG TACTTTCATC GCCGCAATTA TCTGTATTCC ATTAGCATTT 8460

CTAGGTGCAA ATAATATGGT AAAGCTACGC CCAGTTTCAG GTGTTAGCAA ATTTATTTTA 8520

AGTGTTATAC GTGTCTTCCC AGAAATTGTA ATGGCACTTA TATTTATCAA AGCTGTTGGC 8580

CCAGGTTCAT TTTCAGGTGT ATTAGCTTTA GGTATCCATT CCGTAGtATG CTTGGGAAAC 8640

TTTTAGCTGA AGATATTGAA GGTCTAGATT TCAGTGCTGT AGAATCATTA AAGGCCAGTG 8700

GTGCDAATAA GATTAAAACA CTCGTATTTG CAGTCATACC ACAAATTATG CCTGCCTTTC 8760

TATCACTCAT ACTTTATCGC TTTGAACTAA ACTTACGTTC AGCTTCTATA CTGGGGCTAA 8820

TTGGGGCTGG TGGTATCGGG ACACCACTCA TATTTGCCAT TCAAACACGT TCTTGGGACC 8880

GTGTAGGTAT TATATTAATC GGTTTAGTAC TAATGGTCGC AATTGTCGAT TTAATTTCCG 8940

GTTCAATCCG AAAACGTATT GTTTAACATT AAATCAGGAT ACTCCTAAAT AAGAAGTCCT 9000

ACCGTCTTAC GTTTCTCTAT TATAATAAAA ACAGCAGTGA AGAAAACTAT TGTTATAGTT 9060

AACTTCACTG CTGTTTTTAT AATATCTAAA TTTATTCTAT TTCAATTCCT TTAAATAACT 9120

TTTACCGAAC TCTGGTAATG TTACGTTGAA ATTATCTGCT ATAGTTGCAC CGATAGAACT 9180

GAATGTAGTA TCACTTTCTA GTGCATGACC ACCTTTAAAT TTCGGACTGT ACATAATTAC 9240
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